Preface



Computer simulation as well as numerical modelling and optimization are becoming
commonplace in contemporary engineering and science. The complexity
of the systems considered in various fields has been constantly growing
over the years and the theoretical models offer more and more accurate de- 
scription of the physical phenomena, structures and devices. However, most 
of these models are far too complicated to be handled through analytical 
solutions; computer simulation is required for a majority of real-world applications
not only to evaluate the model but also to exploit it in the design
process. Advanced state-of-the-art commercial simulation software packages 
are available and used in everyday design work in mechanical engineering, 
civil engineering, aerospace industry, electrical engineering, and many others. 

Computational optimization has become an essential and, in many cases, 
critical component of the design process. In almost all applications in engi- 
neering and industry it is necessary to maximize performance and efficiency 
while minimizing the cost, size, weight, or energy consumption at the same 
time. This is usually a complex task that involves manipulation of available 
design parameters in order to find satisfactory values of one or more objec- 
tives that are evaluated through often computationally expensive computer 
simulation. In many cases, complex constraints have to be satisfied in the 
optimization process. 

Several factors can complicate the search for an optimal design even further.
One of them is the presence of uncertainties, which is common to most real-world
systems. In particular, material properties and the geometry of a manufactured
device may differ from their nominal values as a result of fabrication tolerances.
Therefore, the optimization process may seek a robust design, which ensures the
highest probability of satisfying the performance requirements in the presence of
uncertainties, rather than just the nominally optimal design. Many optimization
problems are nonlinear and NP-hard; that is, no polynomial-time algorithm is known
for them, and the solution time of known exact methods grows exponentially with
the problem size. In some cases the designer may face multiple local optima, so
that global search procedures are necessary. In addition, many practical problems
have multiple, competing objectives, where the best design is obtained through a
decision-making process based on a set of Pareto-optimal solutions.

The dependence of contemporary engineering design on computer simula- 
tions introduces additional difficulties to optimization. Growing demand for 
accuracy and ever-increasing complexity of structures and systems result in 
the simulation process being more and more time consuming. In many engineering
fields, the evaluation of a single design can take as long as several days or even
weeks, so that straightforward approaches that employ a high-fidelity simulator
directly in the optimization loop are prohibitive. Interestingly, the increasing
computational power of today's computers does not alleviate this problem, because
the availability of faster computers is offset by the tendency to simulate more
and more complex structures and systems with higher and higher accuracy. Moreover,
simulation-based objective functions are inherently noisy, which makes the
optimization process even more difficult. Still, simulation-driven design is
becoming a must for a growing number of areas, which creates a need for robust and
efficient optimization methodologies that can yield satisfactory designs even in
the presence of analytically intractable objectives and limited computational
resources. In particular, any technique that improves the efficiency of simulators
or reduces the function evaluation count is crucially important. Surrogate-based
and knowledge-based optimization use approximations to the objective so as to
reduce the cost of objective evaluations. These approximations are often local,
and their quality evolves as the iterations proceed.

Extensive research conducted in the area of computational optimization 
and modeling resulted in many techniques that alleviate the difficulties of 
traditional design optimization methodologies. Many of these techniques ad- 
dress particular issues, such as multiple local optima, multiple objectives or 
handling computationally expensive cost functions. Substantial progress has 
been observed in the development of derivative-free optimization techniques, 
the use of adjoint sensitivities, as well as methods exploiting surrogate mod- 
els, both function-approximation- and physically-based. 

The chapters of this book are contributed by experts from around the world who
are working in these exciting areas, and each chapter is practically
self-contained. The book strives to review and discuss the latest developments in
optimization and modelling, with a focus on applications to real-world problems in
various disciplines of engineering and science, including aerodynamics, the oil
industry, gas and water transport, microwave engineering, structural engineering,
navigation, civil engineering, and others.

We would like to thank our editors, Drs Thomas Ditzinger and Holger
Schaepe, and staff at Springer for their help and professionalism. Last but 
not least, we thank our families for their help and support. 

Xin-She Yang 

Slawomir Koziel 

2011

Chapter 1

Adjoint-Based Control of Model and 
Discretization Errors for Gas and Water 
Supply Networks 

Pia Domschke, Oliver Kolb, and Jens Lang 



Abstract. We are interested in the simulation and optimization of gas and water 
transport in networks. Those networks consist of pipes and various other compo- 
nents like compressor/pumping stations and valves. The flow through the pipes can 
be described by different models based on the Euler equations, including hyper- 
bolic systems of partial differential equations. For the other components, algebraic 
or ordinary differential equations are used. Depending on the data, different models 
can be used in different regions of the network. We present a strategy that adap- 
tively applies the models and discretizations, using adjoint-based error estimators to 
maintain the accuracy of the solution. Finally, we give numerical examples for both 
types of networks.

1.1 Introduction 

Nowadays, water coming out of the tap is taken for granted in industrialized coun- 
tries. Typically, one does not consider the efforts necessary to ensure its delivery. 
Huge amounts of water have to be routed through miles of networked pipelines. 
Such complex systems are difficult to operate and cost-intensive. The same holds 
for gas supply networks. Both are supposed to work reliably and efficiently for
economic as well as ecological reasons and play an important role in the public utility
infrastructure. Supporting gas and water suppliers with software tools
is therefore of broad public interest. While monitoring systems are already quite advanced,
efficient simulation and optimization tools are only available to some extent. Of 
course, before optimization tasks can be considered, reliable simulation algorithms 
are essential. In this context, reliability implies robustness as well as trustworthy 
error estimates. 

In the field of simulation and optimization of gas and water supply networks, a lot
of research has been done in recent years; see for example [5, 8, 9, 10, 12, 13].
Usually, especially for optimization problems, fixed models and fixed
discretizations are considered. Existing software packages like SIMONE [16] allow
stationary as well as transient models for the simulation of gas networks. However,
for the simulation process, one model has to be chosen in advance. SIMONE is also
able to solve optimal control problems, but only steady-state models are used there.
In [5, 8], where nonlinear programming techniques are used to solve optimal control
problems for gas and water supply networks, full a priori discretizations in time and
space are applied to the underlying equations. Similarly, in [12, 13], where
mixed-integer linear programming is proposed for gas network optimization, fixed
models and fixed discretizations are used. Moreover, the applied discretizations are
typically quite coarse to keep the complexity of the resulting problems tractable.

While coarse discretizations or simplified models are often adequate in many parts
of the considered networks to resolve the dynamics in the daily operation of gas and
water supply networks, none of the approaches mentioned provides information about
the quality of the computed solutions. In [3, 4], a posteriori estimates for the
modelling and discretization errors are introduced for finite element
approximations. There, adjoint calculus is applied to measure the influence of
both errors separately on a given quantity of interest. Recently, we have published
an algorithm to adaptively control model and discretization errors in simulations for
gas supply networks [6, 2]. This is considered to be the first step towards an efficient
optimization framework with reliable error estimates. Due to various similarities, 
the applied concept of adjoint-based error estimators on a network can as well be 
used for water supply networks, which are also considered here. 

This chapter is organized as follows. We begin with a description of the underlying
model equations of gas and water supply networks in Section 1.2. Afterwards,
we derive error estimators for the model and discretization error with respect to
a given quantity of interest (Section 1.3). In Section 1.4, we present an algorithm
where these estimators are used to adaptively control the model and discretization
errors. Finally, numerical results are presented for a gas and a water supply network
in Section 1.5.



1.2 Modelling 

In this section, we give a brief introduction to the modelling of gas and water
supply networks. We begin with some general aspects concerning the models of 
both types of networks before we describe the particular components of each. 

1.2.1 General Aspects 

The flow of gas or water through pipelines is a directed quantity, which can be
adequately described in one space dimension. Since we need to give the pipes an
orientation, we model gas and water supply networks as a directed graph
$\mathcal{G} = (\mathcal{J}, \mathcal{V})$ with arcs $\mathcal{J}$ and vertices
$\mathcal{V}$ (nodes, branching points).

Typically, the set of arcs $\mathcal{J}$ mainly consists of pipes
$\mathcal{J}_p \subset \mathcal{J}$, where we have a hierarchy of models to describe
the underlying gas/water dynamics. For the computations, one of these models is
chosen for each pipe in each time step. From top to bottom, each model in the
hierarchy results from the previous one by making simplifying assumptions. In the
case of gas networks, we have a hierarchy of three models, while we consider only
two models to describe the flow of water. In both cases, the most complex model
consists of a hyperbolic system of partial differential equations (PDEs). Due to the
spatial dimension, we define an interval $[x_j^a, x_j^b]$ with $x_j^a < x_j^b$ for
each pipe $j \in \mathcal{J}_p$. In the case of gas transport, the considered
networks also consist of compressor stations, valves and control valves, while we
have pumping stations, valves and tanks in water supply networks. These components
are described by algebraic equations or ordinary differential equations. Although
there is no continuous spatial dimension for these components, we also use the
spatial coordinates $x_j^a$ and $x_j^b$ to describe states at the beginning and end
of an arc $j \in \mathcal{J} \setminus \mathcal{J}_p$. Alternatively, we denote
ingoing and outgoing states with the subscripts "in" and "out".

In addition to the equations on the arcs of the network, it is necessary to specify
adequate initial, coupling and boundary conditions, which is not trivial in the case
of hyperbolic equations (see for instance [?]). Let $v \in \mathcal{V}$ be an
arbitrary node with ingoing arcs $\delta_v^-$ and outgoing arcs $\delta_v^+$. Then,
Kirchhoff's first rule states that the sum of currents flowing into that node is
equal to the sum of currents flowing out of that node (conservation of mass):

$$ \sum_{i \in \delta_v^+} q(x_i^a, t) - \sum_{j \in \delta_v^-} q(x_j^b, t) = q(v,t) \quad \forall t > 0 \tag{1.1} $$

with an auxiliary variable $q(v,t)$, which can be used to model feed-in or demand
(see below).

For a unique solution, it does not suffice to require only (1.1). A further condition
which is commonly used in practice is the equality of pressure at the node
$v \in \mathcal{V}$, that is,

$$ p(x_i^a, t) = p(v,t) \quad \forall i \in \delta_v^+, \qquad p(x_j^b, t) = p(v,t) \quad \forall j \in \delta_v^-, \qquad \forall t > 0 \tag{1.2} $$

with an auxiliary variable $p(v,t)$ for the pressure at the node. In water supply
networks, the pressure $p$ is typically replaced by the pressure head $h$.

In this work, we use equality of pressure (1.2) together with the conservation
of mass (1.1). Due to the hyperbolic nature of the underlying partial differential
equations, there is one degree of freedom for either boundary of each arc. This
means that at a node $v \in \mathcal{V}$ with $m$ ingoing and $n$ outgoing arcs, we
have $m + n + 2$ degrees of freedom (including $p(v,t)$ and $q(v,t)$) but only
$m + n + 1$ equations. Thus, we need one further equation at the node $v$:

$$ e(p(v,t), q(v,t)) = 0. $$

At branching points in the network, we typically have $q(v,t) = 0$, and at boundary
nodes, we use $p(v,t) = p_v(t)$ or $q(v,t) = q_v(t)$ with given profiles $p_v(t)$ or
$q_v(t)$. According to (1.1), $q(v,t) > 0$ corresponds to a feed-in of gas/water into
the network and $q(v,t) < 0$ to a demand.



1.2.2 Gas Supply Networks 

In this section, we take a closer look at how the flow through gas networks
is modelled. As mentioned in Sect. 1.2.1, the network consists of pipes, compressor
stations, valves and control valves.



1.2.2.1 Pipes 

The models describing gas flow in pipelines are based on the Euler equations, a
hyperbolic system of nonlinear partial differential equations. The system consists
of the conservation of mass, momentum and energy together with the equation of
state for real gases. The transient flow of gas may be described appropriately by
equations in one space dimension; pressure losses due to friction are modelled via
a source term. A common simplification, the restriction to isothermal flows, that is,
flows with constant temperature, makes the energy equation redundant. The
resulting equations are called isothermal Euler equations. If we assume a constant
speed of sound and let the pipes be horizontal, the equations result in the nonlinear
model [?]

$$ p_t + \frac{\rho_0 c^2}{A} q_x = 0, \tag{1.3a} $$

$$ q_t + \frac{A}{\rho_0} p_x + \frac{\rho_0 c^2}{A} \left( \frac{q^2}{p} \right)_x = -\frac{\lambda \rho_0 c^2 |q| q}{2 d A p}. \tag{1.3b} $$

Here, $q$ denotes the flow rate under standard conditions (1 atm air pressure,
temperature of 0 °C), $p$ the pressure, $c$ the speed of sound, $\lambda$ the
friction coefficient, $d$ the diameter, $A$ the cross-sectional area of the pipe and
$\rho_0$ the density under standard conditions.

Neglecting the nonlinear term in the spatial derivative of the momentum equation
(1.3b) yields the semilinear model. This simplification is motivated by the slow
velocity of gas in real networks. We get

$$ p_t + \frac{\rho_0 c^2}{A} q_x = 0, \tag{1.4a} $$

$$ q_t + \frac{A}{\rho_0} p_x = -\frac{\lambda \rho_0 c^2 |q| q}{2 d A p}. \tag{1.4b} $$

A further simplification leads to a (quasi-)stationary model: setting the time
derivatives in (1.4) to zero results in an ordinary differential equation, which can
be solved analytically:

$$ q = \text{const.}, \tag{1.5a} $$

$$ p(x) = \sqrt{ p(x_0)^2 + \frac{\lambda \rho_0^2 c^2 |q| q}{d A^2} (x_0 - x) }. \tag{1.5b} $$

Here, $p(x_0)$ denotes the pressure at an arbitrary point $x_0 \in [x_j^a, x_j^b]$.
Setting $x_0 = x_j^a$, that is, the inlet of the pipe, and $x = x_j^b$, that is, the
end of the pipe, yields the so-called algebraic model [15].
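
As a small illustration of the algebraic model, the following Python sketch evaluates (1.5b) between the inlet and the end of a horizontal pipe. The function name, the circular cross-section used to compute A from d, and all numerical values are assumptions made for this example only; they are not data from the networks considered in this chapter.

```python
import math


def outlet_pressure(p_in, q, length, lam, d, c, rho0):
    """Outlet pressure of a horizontal pipe from the algebraic model (1.5b).

    p_in   inlet pressure [Pa]
    q      flow rate under standard conditions [m^3/s], positive in arc direction
    length pipe length x_b - x_a [m]
    lam    friction coefficient [-]
    d      pipe diameter [m]
    c      speed of sound [m/s]
    rho0   density under standard conditions [kg/m^3]
    """
    area = math.pi * d**2 / 4.0                # assumed circular cross-section A
    k = lam * rho0**2 * c**2 / (d * area**2)   # coefficient in front of |q| q
    p_sq = p_in**2 - k * abs(q) * q * length   # x0 - x = -(x_b - x_a)
    if p_sq < 0.0:
        raise ValueError("flow too large: squared pressure would become negative")
    return math.sqrt(p_sq)


# Illustrative (made-up) data: 50 km pipe, 1 m diameter, 60 bar inlet pressure.
p_out = outlet_pressure(p_in=60e5, q=100.0, length=50e3,
                        lam=0.01, d=1.0, c=340.0, rho0=0.8)
print(f"outlet pressure: {p_out / 1e5:.2f} bar")
```

Because the friction term enters as |q| q, reversing the flow direction raises the pressure along the arc instead of lowering it, so the same formula covers both flow directions.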

1.2.2.2 Compressor Stations 

A compressor station is a facility that increases the pressure of the gas. Running a
compressor generates costs, since the compressor station consumes some of the gas,
that is,

$$ q_{out} = q_{in} - F_c(p_{in}, p_{out}, q_{in}). \tag{1.6} $$

The equation for the fuel consumption of the compressor
$c \in \mathcal{J}_c \subset \mathcal{J}$ is given by

$$ F_c(p_{in}, p_{out}, q_{in}) = d_{F,c} \, q_{in} \left( \left( \frac{p_{out}}{p_{in}} \right)^{\frac{\gamma - 1}{\gamma}} - 1 \right) \tag{1.7} $$

with $p_{in} = p(x_c^a, t)$, $p_{out} = p(x_c^b, t)$, $q_{in} = q(x_c^a, t)$ and
$q_{out} = q(x_c^b, t)$ [10]. Here, $\gamma$ is the isentropic coefficient of the gas.
The coefficient $d_{F,c}$ is a compressor-specific constant. The increase in
pressure, performed by the compressor station $c$, is denoted by

$$ \Delta p_c(t) = p_{out} - p_{in} \tag{1.8} $$

and depends on the compressor power

$$ P_c(p_{in}, p_{out}, q_{in}) = d_{P,c} \, q_{in} \left( \left( \frac{p_{out}}{p_{in}} \right)^{\frac{\gamma - 1}{\gamma}} - 1 \right) \tag{1.9} $$

with a compressor-specific constant $d_{P,c}$. Either (1.8) or (1.9) is typically
used as control variable.
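
The compressor relations (1.6), (1.7) and (1.9) can be sketched in a few lines of Python. The function names, the default value of the isentropic coefficient and the constants in the usage lines are assumptions chosen purely for illustration.

```python
def compression_term(p_in, p_out, gamma):
    """Common factor (p_out / p_in)^((gamma - 1) / gamma) - 1 of (1.7) and (1.9)."""
    return (p_out / p_in) ** ((gamma - 1.0) / gamma) - 1.0


def fuel_consumption(p_in, p_out, q_in, d_f, gamma=1.3):
    """Fuel gas consumption F_c of a compressor, cf. (1.7).

    d_f is the compressor-specific constant d_{F,c}; gamma = 1.3 is an
    assumed, illustrative value of the isentropic coefficient.
    """
    return d_f * q_in * compression_term(p_in, p_out, gamma)


def compressor_power(p_in, p_out, q_in, d_p, gamma=1.3):
    """Compressor power P_c, cf. (1.9), with the constant d_{P,c}."""
    return d_p * q_in * compression_term(p_in, p_out, gamma)


def outgoing_flow(p_in, p_out, q_in, d_f, gamma=1.3):
    """Outgoing flow rate q_out = q_in - F_c, cf. (1.6)."""
    return q_in - fuel_consumption(p_in, p_out, q_in, d_f, gamma)


# Made-up example: compress from 50 bar to 60 bar at q_in = 100 m^3/s.
q_out = outgoing_flow(p_in=50e5, p_out=60e5, q_in=100.0, d_f=0.01)
print(f"q_out = {q_out:.4f} m^3/s")
```

If the compressor does not raise the pressure (p_out = p_in), both F_c and P_c vanish and the station passes the flow through unchanged.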

1.2.2.3 Valves 

Valves are used to regulate the flow of the gas by opening or closing. In the case of
an open valve, the equations

$$ p_{in} = p_{out}, \qquad q_{in} = q_{out} $$

hold. If the valve is closed, then $q_{in} = q_{out} = 0$.

1.2.2.4 Control Valves 

Control valves, sometimes also referred to as regulators [8], are valves that reduce
the gas pressure by a controlled amount. The behaviour of a control valve is
modelled via

$$ p_{in} - p_{out} \overset{!}{=} u $$

with control variable $u = u(t)$. The ingoing and outgoing flow rates are identical:

$$ q_{in} = q_{out}. $$

1.2.3 Water Supply Networks 

Water supply networks have a structure similar to that of gas supply networks. Here,
the main components are pipes, pumps, valves and tanks.

1.2.3.1 Pipes 

To describe the dynamics inside the pipes of a water supply network, we consider
two different models, which can for instance be found in [1]. The most complex
model, considering the elastic effects, is given by the so-called water hammer
equations

$$ h_t + \frac{a^2}{g A} q_x = 0, \tag{1.10a} $$

$$ q_t + g A h_x = -\lambda \frac{q |q|}{2 D A}, \tag{1.10b} $$

a semilinear hyperbolic system of partial differential equations, where the
piezometric head $h$ and the flow rate $q$ are the space- and time-dependent state
variables. The gravitational acceleration is denoted by $g$, $a$ is the speed of
sound in the pipe, and $A$ and $D$ are the cross-sectional area and the diameter of
the pipe, respectively. The right-hand side of (1.10b) models the influence of
friction, where $\lambda$ is the friction coefficient.

A simplified model for the water dynamics inside the pipes can be derived by
neglecting the time derivatives in (1.10). The resulting (quasi-)stationary or
algebraic model reads

$$ q_x = 0, \tag{1.11a} $$

$$ h_x = -\lambda \frac{q |q|}{2 g D A^2}. \tag{1.11b} $$

Thus, the flow rate is constant in the pipe,

$$ q_{in} = q_{out}, \tag{1.12} $$

and the entire pressure head loss is given by

$$ h_{in} - h_{out} = \lambda \frac{L}{2 g D A^2} q |q|, \tag{1.13} $$

where $L$ denotes the length of the pipe, and $h_{in}$ and $h_{out}$ the ingoing and
outgoing pressure head, respectively.
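
The head loss (1.13) of the algebraic water model is easy to evaluate directly. The sketch below is only an illustration: the function name, the circular cross-section used to obtain A from D, and the example values are assumptions.

```python
import math

G = 9.81  # gravitational acceleration [m/s^2]


def head_loss(q, length, lam, diameter):
    """Pressure head loss h_in - h_out along a pipe, cf. (1.13).

    q        flow rate [m^3/s], positive in arc direction
    length   pipe length L [m]
    lam      friction coefficient [-]
    diameter pipe diameter D [m]
    """
    area = math.pi * diameter**2 / 4.0  # assumed circular cross-section A
    return lam * length / (2.0 * G * diameter * area**2) * q * abs(q)


# Made-up example: 2 km pipe with 0.5 m diameter carrying 0.2 m^3/s.
loss = head_loss(q=0.2, length=2000.0, lam=0.02, diameter=0.5)
print(f"head loss: {loss:.3f} m")
```

The loss is antisymmetric in q and grows quadratically with the flow rate, so doubling q quadruples the head difference.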

1.2.3.2 Pumps 

Pumps are installed in water supply networks to generate or maintain a certain pres- 
sure. Typically, the relation between the flow rate through a pump and the resulting 
pressure increase is described by a set of characteristic curves. A commonly used 
form for a single curve (see e.g. [5]) is

$$ H(q) := h_{out} - h_{in} = \alpha_0 - \alpha_r q^r $$

with

$$ q = q_{in} = q_{out} $$

and $r \in \mathbb{R}_+$. If multiple curves are given, the parameters $\alpha_0$ and
$\alpha_r$ depend on the current speed $\omega$ of the pump.

The running costs of a pump result from the power consumption of its motor.
Applying an efficiency curve $\eta_c(q)$ as in [14, pp. 36-37] or [9, p. 46], the
costs of each pump $c \in \mathcal{J}_c \subset \mathcal{J}$ are proportional to

$$ P_c(t) = \frac{q \, (h_{out} - h_{in})}{\eta_c(q)}. $$

Again, if multiple curves are given to characterize the pump, the efficiency
$\eta_c$ additionally depends on the speed of the pump.

1.2.3.3 Valves 

There are various kinds of valves installed in water supply networks to control the 
flow rate and pressure. Here, we consider gate valves, where the opening can be 
externally controlled. 

With $u \in [0,1]$ being the control variable for the fraction of the opening, we apply

$$ u^2 (h_{in} - h_{out}) = \zeta q |q| $$

with the friction loss coefficient $\zeta$ and

$$ q = q_{in} = q_{out}. $$

1.2.3.4 Tanks 

Water tanks are used to store water at certain positions in the network. The ingoing
flow at the bottom of a tank is given by

$$ q = C \, \mathrm{sign}(h_{outer} - h_{inner}) \sqrt{|h_{outer} - h_{inner}|} \tag{1.14} $$

with the discharge coefficient $C$. Here, $h_{outer}$ denotes the outer pressure head
in front of the inlet of the tank and $h_{inner}$ is the inner pressure head at the
bottom of the tank, which is the sum of the elevation of the tank and the current
stage:

$$ h_{inner} = \text{elevation} + \text{stage}. $$

Note that formally

$$ h_{outer} = h(x_j^a, t), \qquad q = q(x_j^a, t) $$

for each tank $j \in \mathcal{J}$.

The change of the stage, and therewith the change of $h_{inner}$, is modelled by an
ordinary differential equation:

$$ \frac{d}{dt} h_{inner} = \frac{1}{A} q, \tag{1.15} $$

where $A$ is the cross-sectional area of the tank.

In addition to the ingoing flow at the bottom of the tank (given by (1.14)), there
can be further inflow or outflow openings, e.g. for refilling the tank or overflow.
In the model equations, those terms can simply be added to $q$ in (1.15) and
formally refer to $q(x_j^b, t)$.
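
To make the tank model concrete, the following Python sketch integrates (1.15) with the inflow law (1.14) for a constant outer pressure head. Explicit Euler is used here only because it is the simplest one-step method; it is an assumption of this example, as are the function names and all numbers.

```python
import math


def tank_inflow(h_outer, h_inner, discharge_coeff):
    """Flow through the bottom inlet of a tank, cf. (1.14)."""
    diff = h_outer - h_inner
    return discharge_coeff * math.copysign(1.0, diff) * math.sqrt(abs(diff))


def simulate_tank(h_outer, elevation, stage0, area, discharge_coeff, dt, n_steps):
    """Integrate d/dt h_inner = q / A, cf. (1.15), with explicit Euler steps."""
    h_inner = elevation + stage0  # inner head = elevation + stage
    history = [h_inner]
    for _ in range(n_steps):
        q = tank_inflow(h_outer, h_inner, discharge_coeff)
        h_inner += dt * q / area
        history.append(h_inner)
    return history


# Made-up example: the tank fills until the inner head matches the outer head.
heads = simulate_tank(h_outer=12.0, elevation=8.0, stage0=2.0, area=100.0,
                      discharge_coeff=0.5, dt=10.0, n_steps=100)
print(f"inner head after {100 * 10.0:.0f} s: {heads[-1]:.3f} m")
```

Because of the square root in (1.14), the explicit scheme can overshoot slightly once the two heads are almost equal, which is one reason why the step size of such a one-step method should be subject to error control.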

1.3 Error Estimators 

For given initial and boundary conditions as well as control states for the
controllable elements, the described model equations on the whole network can be
solved by applying appropriate discretization schemes. For the discretization of
the (hyperbolic) PDEs in the pipes, we apply an implicit box scheme [11], which
matches the properties of the underlying equations. The time steps of this scheme
are also used to discretize the ordinary differential equations occurring in the
model of water tanks. So far, one-step methods have been implemented for this
purpose and have delivered satisfactory results.

Now, we are searching for a compromise between the accuracy of the numerical
solution and the computational costs. We want to use the more complex models in
the pipes only when necessary and to refine the discretizations only where needed.
Using the solution of adjoint equations as done in [3, 4, 2, 6, 7], one may deduce
model and discretization error estimators to measure the influence of the model
and the discretization on a user-defined target functional $M$. With $u$ being the
exact solution of the (most complex) model equations and $u^h$ being the approximate
(numerical) solution for some choice of models, the error in the target functional
can be approximated by

$$ M(u) - M(u^h) \approx \eta_m + \eta_h, \tag{1.16} $$

where $\eta_m$ estimates the model error and $\eta_h$ the error resulting from the
discretization. Concerning the underlying adjoint equations, these error estimators
are currently implemented in a first-discretize manner, which will be briefly
described in the following. A more detailed description with some hints on the
implementation can be found in [7].

Let $t_j$ ($j = 0, \dots, N$) be the times of the discretization. Accordingly, we
split up the solution of the discretized model equations

$$ (u^h)^T = \left( (u_0^h)^T, (u_1^h)^T, \dots, (u_N^h)^T \right). $$

Starting with the given initial state $u_0^h$, we have to solve a system of the form

$$ F_j(\underbrace{u_{j-1}^h}_{\text{old}}, \underbrace{u_j^h}_{\text{new}}) = 0 \tag{1.17} $$

in each time step. In the adaptive algorithm described in the next section, we will
partition the entire simulation horizon into several blocks. Accordingly, we define

$$ (U_k^h)^T = \left( (u_{j(k-1)+1}^h)^T, \dots, (u_{j(k)}^h)^T \right) $$

for $k = 1, \dots, N_B$, where $j(k)$ is given via $t_{j(k)} = T_k$ for
$k = 0, \dots, N_B$. For later use, we also define

$$ (E_k)^T = \left( (F_{j(k-1)+1})^T, \dots, (F_{j(k)})^T \right), \tag{1.18} $$

which summarizes the state-defining equations of the block $[T_{k-1}, T_k]$.

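Now, we can estimate the model error of the $k$th block with respect to the
functional $M$ via

$$ \eta_{m,k} = \frac{\partial}{\partial U} M(U_k^h) \, \Delta U_k^h, \tag{1.19} $$

where $\Delta U_k^h = \tilde{U}_k - U_k^h$. Here, $\tilde{U}_k$ formally denotes a
reference solution in the $k$th block which solves a different system of equations
$\tilde{E}_k$ based on more complex or simpler models. Thus, the difference
$\Delta U_k^h$ results from the differences in the models and can be estimated by

$$ \Delta U_k^h \approx \left( -\frac{\partial}{\partial U} \tilde{E}_k(U_k^h) \right)^{-1} \Delta E_k \tag{1.20} $$

with

$$ \Delta E_k = \tilde{E}_k(U_k^h) - \tilde{E}_k(\tilde{U}_k) = \tilde{E}_k(U_k^h). \tag{1.21} $$

Inserting (1.20) and (1.21) in (1.19) finally gives

$$ \eta_{m,k} \approx -\frac{\partial}{\partial U} M(U_k^h) \left( \frac{\partial}{\partial U} \tilde{E}_k(U_k^h) \right)^{-1} \tilde{E}_k(U_k^h) = -\xi_k^T \tilde{E}_k(U_k^h) $$

with $\xi_k$ being the solution of the adjoint equation

$$ \left( \frac{\partial}{\partial U} \tilde{E}_k(U_k^h) \right)^T \xi_k = \left( \frac{\partial}{\partial U} M(U_k^h) \right)^T. \tag{1.22} $$

Instead of $\frac{\partial}{\partial U} \tilde{E}_k(U_k^h)$ one may also apply
$\frac{\partial}{\partial U} E_k(U_k^h)$ in (1.22) to get an error estimate. This
way, it suffices to solve one system of adjoint equations (per block) for the
estimators with respect to higher and lower models and also with respect to the
discretization. Moreover, note that (1.22) can be solved very efficiently due to the
special structure of $E_k$ and $\tilde{E}_k$ (see [?]).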
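
The mechanics of the estimator can be checked on a toy problem. The Python sketch below assumes that both the applied model E(U) = A U - b and the reference model E~(U) = A~ U - b are linear 2x2 systems and that the target functional is M(U) = c^T U; the matrices, the right-hand side and all helper names are invented for this illustration and have nothing to do with the network equations. In this linear setting the adjoint estimate is exact, which makes the identity easy to verify.

```python
def solve2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]


def matvec(a, x):
    """Matrix-vector product for a 2x2 matrix."""
    return [a[0][0] * x[0] + a[0][1] * x[1],
            a[1][0] * x[0] + a[1][1] * x[1]]


def transpose(a):
    """Transpose of a 2x2 matrix."""
    return [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]


def dot(x, y):
    """Euclidean inner product of two vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


a_simple = [[2.0, 0.0], [0.0, 3.0]]   # applied (simplified) model E
a_ref = [[2.0, 0.5], [0.2, 3.0]]      # reference (more complex) model E~
b = [1.0, 2.0]
c = [1.0, 1.0]                        # target functional M(U) = c^T U

u_h = solve2(a_simple, b)                                      # solves E(U) = 0
residual = [r - bi for r, bi in zip(matvec(a_ref, u_h), b)]    # E~(U^h)
xi = solve2(transpose(a_ref), c)                               # adjoint equation (1.22)
eta_m = -dot(xi, residual)                                     # estimator -xi^T E~(U^h)

u_ref = solve2(a_ref, b)                                       # reference solution
true_error = dot(c, u_ref) - dot(c, u_h)                       # M(U~) - M(U^h)
print(f"estimate {eta_m:.6f}, true difference {true_error:.6f}")
```

Only one adjoint solve is needed, and the same adjoint vector could be reused with a different residual, which mirrors the remark above that a single adjoint system per block serves several estimators.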

Regarding the derivation of the model error estimator $\eta_{m,k}$ above, one may
observe that an error estimator for discretization errors can be deduced in exactly
the same way. Here, the reference solution $\tilde{U}_k$ resulting from solving a
modified system of equations $\tilde{E}_k$ must refer to another discretization. In
our implementation, the residual $\tilde{E}_k(U_k^h)$ is estimated by comparing the
single terms in the applied discretization scheme with reconstruction formulas of
higher order. This way, separate error estimators for the temporal and the spatial
error (for each element in each block) can be evaluated:

$$ \eta_{h,k} = \eta_{x,k} + \eta_{t,k}, \tag{1.23} $$

where $\eta_{x,k}$ denotes the estimator for the spatial error and $\eta_{t,k}$ for
the temporal error in the $k$th block [6, 7].



1.4 Adaptive Error Control 

In the last section, we have developed error estimators for model and discretization
errors. With these estimators, we can now control the computational errors inside the
network. Since in practice the dynamic behaviour in the network varies, we want to
control the relative error resulting from the choice of the models and the
discretization in blocks of several time steps. Thus, we divide the time interval
$[0, T]$ into blocks of equal size $[T_{k-1}, T_k]$, $k = 1, \dots, N_B$. For one
subinterval $[T_{k-1}, T_k]$, we can compute the forward as well as the
backward/adjoint solution and evaluate the error estimators locally, which yields

$$ M_k(u) - M_k(u^h) \approx \eta_{m,k} + \eta_{t,k} + \eta_{x,k}. $$

Given a tolerance TOL for the relative error, we can approximate the exact error by
the estimators, giving

$$ \frac{|M_k(u) - M_k(u^h)|}{|M_k(u)|} \approx \frac{|\eta_{m,k} + \eta_{t,k} + \eta_{x,k}|}{|M_k(u^h)|} \le \mathrm{TOL}. \tag{1.24} $$

We first examine the discretization error to ensure that the discretization is
adequate. Then we consider the model error.

Check Discretization Error. First, the discretization is checked. Given the
tolerance TOL as above, we ensure that the discretization error is small enough by
scaling TOL with a user-defined factor $0 < \kappa < 1$, giving
$\mathrm{TOL}_h := \kappa \cdot \mathrm{TOL}$. We demand the discretization error
estimator to satisfy

$$ |\eta_{t,k} + \eta_{x,k}| \le \mathrm{TOL}_h \cdot |M_k(u^h)|. $$

If the error estimator exceeds the given upper bound, the temporal and spatial
discretization errors are treated individually, that is, we require

$$ |\eta_{t,k}| \le \tfrac{1}{2} \mathrm{TOL}_h \cdot |M_k(u^h)| \quad \text{and} \quad |\eta_{x,k}| \le \tfrac{1}{2} \mathrm{TOL}_h \cdot |M_k(u^h)|. $$



Check Temporal Discretization Error. If the temporal error estimator exceeds the
given tolerance, the time step size is marked for refinement. After checking the
spatial discretization error, the time interval $[T_{k-1}, T_k]$ has to be computed
again. If, in contrast, the error estimator $|\eta_{t,k}|$ is much smaller than the
upper bound, the time step size is marked for coarsening. If the current time
interval has to be recomputed due to spatial or model errors, the temporal
coarsening is not applied.

Check Spatial Discretization Error. Now, the spatial discretization error is
estimated locally for each pipe,

$$ \eta_{x,k} = \sum_{i \in \mathcal{J}_p} \eta_{x,k,i}. $$

Thus, we want to satisfy

$$ \Big| \sum_{i \in \mathcal{J}_p} \eta_{x,k,i} \Big| \le \tfrac{1}{2} \mathrm{TOL}_h \cdot |M_k(u^h)|. $$

For this inequality to hold, it suffices to require

$$ \sum_{i \in \mathcal{J}_p} |\eta_{x,k,i}| \le \tfrac{1}{2} \mathrm{TOL}_h \cdot |M_k(u^h)|. $$

In order to get an upper bound for each pipe itself, we uniformly distribute the
target functional, i.e., we divide it by the number of pipes $|\mathcal{J}_p|$, giving

$$ |\eta_{x,k,i}| \le \tfrac{1}{2} \mathrm{TOL}_h \cdot \frac{|M_k(u^h)|}{|\mathcal{J}_p|} \quad \forall i \in \mathcal{J}_p. $$

If $|\eta_{x,k,i}|$ exceeds the given tolerance, the pipe is marked for refinement.
If, instead, the error estimator is much smaller than the right-hand side, the pipe
is marked for coarsening. The time interval $[T_{k-1}, T_k]$ is computed again with a
finer discretization where needed.

Check Total Error. If the discretization error is small enough, the total error
estimator $\eta_{m,k} + \eta_{t,k} + \eta_{x,k}$ is evaluated. If

$$ \frac{|\eta_{m,k} + \eta_{t,k} + \eta_{x,k}|}{|M_k(u^h)|} > \mathrm{TOL}, $$

that is, the total error does not fulfill the desired tolerance while the
discretization error does, the model error is checked.

Check Model Error. If the discretization error is small enough, but the total error
is not, the model errors of all pipes are checked. Again, we uniformly distribute the
target functional over all pipes. If the error estimator exceeds the given tolerance,
that is,

$$ \frac{|\eta_{m,k,i}|}{|M_k(u^h)|} > \frac{\mathrm{TOL}_m}{|\mathcal{J}_p|} $$

with $\mathrm{TOL}_m := (1 - \kappa) \cdot \mathrm{TOL}$, the pipe is switched to the
next more complex model in the hierarchy. The time interval $[T_{k-1}, T_k]$ is
computed again with the adjusted models.

Coarsen Temporal and/or Spatial Discretization and Switch Down Models. If the
total error fulfills the desired tolerance, the time interval $[T_{k-1}, T_k]$ is
accepted and $k$ is increased. If the time step size or any pipes were marked for
coarsening, the coarsening is applied. Then, the estimators with respect to the lower
models are computed. If the error estimator is much less than the given tolerance,
that is,

$$ \frac{|\eta_{m,k,i}|}{|M_k(u^h)|} \le s \cdot \frac{\mathrm{TOL}_m}{|\mathcal{J}_p|} $$

with a "shift down factor" $s \ll 1$ (e.g. $10^{-1}$ or $10^{-2}$), the pipe can use
the lower model for the next calculations and we go on to the next interval.
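
The checks of this section can be summarized as a decision function for a single block. The Python sketch below is a simplified reading of the strategy, not the authors' implementation: the function name, the dictionary of actions and in particular the factor `coarsen` (a hypothetical quantification of "much smaller than the upper bound") are assumptions, and pipes that already use the highest or lowest model are not treated specially.

```python
def check_block(eta_t, eta_x, eta_m_up, eta_m_down, m_k, tol,
                kappa=0.1, s=0.1, coarsen=0.1):
    """Decide what to do with one time block [T_{k-1}, T_k].

    eta_t       temporal error estimator of the block
    eta_x       list of spatial error estimators, one entry per pipe
    eta_m_up    list of model error estimators w.r.t. the higher models
    eta_m_down  list of model error estimators w.r.t. the lower models
    m_k         value of the target functional M_k(u^h) on the block
    """
    n = len(eta_x)
    scale = abs(m_k)
    tol_h = kappa * tol            # share of TOL for the discretization error
    tol_m = (1.0 - kappa) * tol    # share of TOL for the model error
    half = 0.5 * tol_h * scale     # bound for temporal / spatial part
    actions = {"accept": False, "refine_time": False, "coarsen_time": False,
               "refine_pipes": [], "coarsen_pipes": [],
               "model_up": [], "model_down": []}

    # 1) Discretization error: refine and recompute the block if too large.
    if abs(eta_t + sum(eta_x)) > tol_h * scale:
        actions["refine_time"] = abs(eta_t) > half
        actions["refine_pipes"] = [i for i, e in enumerate(eta_x)
                                   if abs(e) > half / n]
        return actions

    # 2) Total error: switch pipes with large model error to the higher model.
    total = eta_t + sum(eta_x) + sum(eta_m_up)
    if abs(total) > tol * scale:
        actions["model_up"] = [i for i, e in enumerate(eta_m_up)
                               if abs(e) > tol_m * scale / n]
        return actions

    # 3) Block accepted: apply coarsening marks and switch models down.
    actions["accept"] = True
    actions["coarsen_time"] = abs(eta_t) <= coarsen * half
    actions["coarsen_pipes"] = [i for i, e in enumerate(eta_x)
                                if abs(e) <= coarsen * half / n]
    actions["model_down"] = [i for i, e in enumerate(eta_m_down)
                             if abs(e) <= s * tol_m * scale / n]
    return actions


# Made-up estimator values for a network with two pipes.
result = check_block(eta_t=1e-6, eta_x=[1e-6, 1e-6], eta_m_up=[1e-5, 1e-5],
                     eta_m_down=[1e-4, 1.0], m_k=1.0, tol=1e-2)
print(result)
```

Coarsening is only reported when the block is accepted, which reflects the rule that temporal coarsening is not applied if the interval has to be recomputed.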



1.5 Numerical Examples 

In this section, we give numerical results for a medium sized real life gas network 
and a water supply network. All presented computations were done on an AMD 
Athlon™ 64 X2 Dual Core 6000+. 



1.5.1 Gas Supply Network 

We begin with a gas supply network, which is shown in Fig. 1.1. The considered
network consists of twelve pipes (P01 - P12, with lengths between 30 km and 100 km),
two sources (S01 - S02), four consumers (C01 - C04), three compressor stations
(Comp01 - Comp03) and one control valve (CV01).



Fig. 1.1 Gas supply network with compressor stations and control valve

The simulation starts with stationary initial data. The boundary conditions and
the control for the compressor stations and the control valve are time-dependent.
Plots of the control functions are given in Fig. 1.1. The target functional is given by
the total fuel gas consumption of the compressors, i.e.

$$ M(u) = \sum_{c \in \mathcal{J}_c} \int_0^T F_c(t) \, dt. $$

The simulation time is 86,400 seconds (24 hours) with an initial time step size
$\Delta t = 3{,}600$ seconds. The subintervals are 7,200 seconds (2 hours) each. The
initial spatial step size is $\Delta x = 10{,}000$ m. The factor $\kappa$ is set to
$10^{-1}$ and the shift down factor to $s = 10^{-1}$. The tolerance TOL is set to
values between $10^{-1}$ and $10^{-4}$.
Table 1.1 shows the maximal relative error in the target functional

$$ \text{rel.err.} = \max_k \frac{|M_k(u) - M_k(u^h)|}{|M_k(u)|}, \tag{1.25} $$

the total target functional, the maximal and the minimal time and spatial step size 
used subject to the tolerance TOL and the running time. As an approximation of 
the exact solution we computed a solution with the nonlinear model and a finer 
discretization than used in the adaptive algorithm, which is shown in the last row. 

Table 1.1 Results using different values for TOL

TOL                 rel.err.      M(u^h)            max/min Δt [s]   max/min Δx [m]    time [s]
1e-01               1.690905e-01  5.0480603810e+01  3600/900         33,333.3/10,000   2.7e-01
1e-02               1.756343e-02  4.8408860265e+01  900/450          33,333.3/10,000   1.0e+00
1e-03               1.288994e-03  4.8487439184e+01  225/28.125       16,666.7/1,250    3.5e+01
1e-04               4.010694e-05  4.8486374440e+01  14.0625/1.7578   16,666.7/312.5    1.0e+03
reference solution                4.8485402013e+01  1                312.5             4.2e+03



Generally, we observe that the maximal relative error decreases with the tolerance 
TOL. We can also see that the error estimators do not provide a sharp upper bound 
for the error. 
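
The error measure (1.25) can be sketched in a few lines of Python. This is only an illustration: it assumes that the index k runs over the blocks (subintervals) of the simulation, and the function name and the per-block values below are hypothetical, not taken from the computations reported above.

```python
def max_relative_error(exact_blocks, approx_blocks):
    """Largest value of |M_k(u) - M_k(u_h)| / |M_k(u)| over all blocks k."""
    if len(exact_blocks) != len(approx_blocks):
        raise ValueError("both sequences need one entry per block")
    return max(abs(m - mh) / abs(m) for m, mh in zip(exact_blocks, approx_blocks))


# Hypothetical per-block target functional values (reference vs. adaptive run).
reference = [4.0, 4.1, 3.9]
adaptive = [4.02, 4.1, 3.8]
print(max_relative_error(reference, adaptive))
```

Because the maximum is taken block by block, the measure can exceed the relative error of the total target functional when deviations in different blocks cancel.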

Besides the discretization, it is also interesting how the model switching part 
works depending on TOL. Table 1.2 shows how often which model is used during
the simulation. The trend is the same as for the discretization. For smaller tolerances, 
the share of the more complex models is higher. 

1.5.2 Water Supply Network 

As a second example, we consider the water supply network shown in Fig. 1.2.
The network consists of sixteen pipes (P01 - P16, with lengths between 500m
and 20km), two suppliers (S01 - S02), six consumers (C01 - C06), three pumps
(Pump01 - Pump03), four tanks (T01 - T04) and two valves (V01 - V02).
Table 1.2 Models used during simulation for different values of TOL

TOL     ALG     LIN     NL
1e-01   100%    0%      0%
1e-02   46.7%   53.3%   0%
1e-03   13.7%   85%     1.3%
1e-04   2.5%    73.7%   23.7%




Fig. 1.2 Water supply network with pumps, valves and tanks 



The simulation starts with approximately stationary data. The boundary condi- 
tions as well as the control for the pumps and the valves are time-dependent. As 
above, plots of the control functions are given in Fig. 1.2. The target functional is
given by the total energy consumption of the pumps, which is proportional to 



$$M(u) = \sum_{c} \int P_c(t)\, dt.$$



The simulation time is 86,400 seconds (24 hours) with an initial time step size $\Delta t$ =
3,600 seconds. The subintervals are 7,200 seconds (2 hours) each. The initial spatial
step size has been chosen as coarse as possible such that the applied (spatial) error
estimators can be evaluated, that is exactly four grid points per pipe. Similar to
above, the factor K is set to 10^' and the shift down factor s = 10^'. The tolerance
TOL is set to values between $10^{-1}$ and $10^{-4}$.

Table 1.3 shows the maximal relative error in the target functional according
to (1.25), the total target functional, the maximal and the minimal time and spatial
step size used subject to the tolerance TOL and the running time. As an approximation
of the exact solution we computed a solution with the time-dependent model
and a finer discretization than used in the adaptive algorithm, which is shown in the
last row.

Table 1.3 Results using different values for TOL

TOL                 rel.err.      M(u^h)            max/min Δt      max/min Δx     time [s]
1e-01               1.228465e-02  8.3184539798e+01  3600/900        6666.7/166.7   5.4e-01
1e-02               1.491658e-03  8.3079443781e+01  900/225         6666.7/166.7   1.9e+00
1e-03               9.460297e-04  8.3014423132e+01  112.5/28.125    6666.7/166.7   9.7e+00
1e-04               7.546011e-05  8.3037462607e+01  14.0625/3.5156  6666.7/166.7   1.6e+02
reference solution                8.3038197026e+01  2               100            1.5e+03

As above, we observe that the maximal relative error in each block decreases
with TOL. Moreover, the error tolerance is satisfied here. Obviously, the time
discretization plays the crucial role in this example. The initial spatial discretization
is not refined. Additionally, the difference between the two models only becomes
important for the smallest tolerance, which can be seen from Table 1.4.

Table 1.4 Models used during simulation for different values of TOL

TOL     ALG     LIN
1e-01   100%    0%
1e-02   100%    0%
1e-03   100%    0%
1e-04   42.2%   57.8%



1.6 Conclusion and Outlook 

In this chapter, we have presented an algorithm to adaptively control model and 
discretization errors for the simulation of gas and water flow through networked 
pipelines. The gas and water dynamics in the pipes are described by a hierarchy 
of models, ranging from partial differential to algebraic equations. Further network 
components are modelled by algebraic and ordinary differential equations. Using 
adjoint equations, we introduced error estimators to measure the influence of the 
discretization in time and space and the applied models with respect to a given 
target functional. With these estimators, we developed an algorithm to adaptively 
control the different errors within a given tolerance. 

We gave examples for both types of networks to show the applicability of the 
algorithm. In both cases, it could be seen that the actual errors decreased with the 
prescribed tolerance. By construction, the error estimators do not provide an upper 
bound but are a first order approximation of the true error. For the considered water 
supply network, all error bounds were maintained. For the gas network, the actual 
errors were slightly larger than the given tolerance. 

The results achieved so far (also in (^T\) make us confident that the presented 
techniques to solve simulation tasks can build a reliable basis to address optimal 
control problems for gas and water supply networks. In particular, the sensitivity 
information computed for the evaluation of the error estimators can be used to compute
gradient information for derivative-based optimization. There, we will have to
consider multiple quantities of interest. These are the objective function of the given 
task and all constraints, which are supposed to be evaluated within given tolerances 
as well. 

Another part of our future work is the extension of the presented approach to further
applications. The principle of adjoint-based control of model and discretization
errors is not restricted to gas and water supply networks. For instance, other transport
processes on networks can be considered. Here, one application we have in mind is 
traffic flow on networks. 
Chapter 2

Derivative-Free Optimization 

for Oil Field Operations 

David Echeverria Ciaurri, Tapan Mukerji, and Louis J. Durlofsky 



Abstract. A variety of optimization problems associated with oil production in- 
volve cost functions and constraints that require calls to a subsurface flow simulator. 
In many situations gradient information cannot be obtained efficiently, or a global 
search is required. This motivates the use of derivative-free (non-invasive, black- 
box) optimization methods. This chapter describes the use of several derivative-free 
techniques, including generalized pattern search, Hooke-Jeeves direct search, a ge- 
netic algorithm, and particle swarm optimization, for three key problems that arise 
in oil field management. These problems are the optimization of settings (pressure 
or flow rate) in existing wells, optimization of the locations of new wells, and data 
assimilation or history matching. The performance of the derivative-free algorithms 
is shown to be quite acceptable, especially when they are implemented within a 
distributed computing environment. 

2.1 Introduction 

Oil and natural gas account for around 60% of the current worldwide primary energy 
supply, and the demand for these key resources is expected to increase for several 
decades. Because the development of new fields is often very expensive and techni- 
cally challenging, it is essential that these operations are performed as efficiently as 
possible. In addition, the high expense of discovering and developing new fields pro- 
vides a substantial economic incentive to maximize production from existing fields. 
Both of these trends provide strong motivation for the development and application 
of robust methodologies for the computational optimization of oil field operations. 
The closed-loop reservoir management paradigm [1] provides a framework for
efficiently operating an oil field. This approach relies on the continuous acquisition 
of field data, which are then used to calibrate the computational reservoir model. 
This represents a data assimilation or history-matching step. The resulting (history- 
matched) model is then used for optimizing future production. This can be accom- 
plished by either determining optimal settings/controls (e.g., flow rates, well pres- 
sures) for existing wells or by finding the best locations for new wells. Given the 
fact that many different types of wells can be drilled, such as deviated, horizontal 
or multi-branched wells, the determination of the appropriate well type can also be 
viewed as an optimization problem. 

In this chapter, we address three of the key optimization problems that arise in 
reservoir engineering - optimization of well settings, optimization of the placement 
of new wells, and data assimilation. Although there are inter-relationships between 
these various problems, they have important differences and are typically addressed 
in a decoupled manner. Well control optimization usually has real-valued decision 
variables, and a nonlinear, simulation-based cost function and constraints. The well 
location (often referred to as field development) problem entails, in general, finding 
the number, type, location and drilling sequence of new wells. In practice, because 
wells are associated with cell centers in the underlying simulation grid, the optimization
variables are typically integers. The well type is described by categorical variables.
Model calibration (data assimilation) can be formulated as an inverse problem
where we seek to minimize the discrepancy between measured data and model out- 
put. The requisite optimization usually involves a very large number of variables 
(normally at least one per simulation grid block, and in practical problems there 
are 0(10"* — 10^) blocks), so parameter reduction and regularization techniques are 
commonly applied. The subsurface flow simulations required for all of the afore- 
mentioned optimizations entail numerical solutions of sets of discretized partial dif- 
ferential equations. These function evaluations can be very costly, and this is a key 
consideration when designing the optimization framework. 

Although our emphasis in this paper is on the use of derivative-free optimization 
methods, it is important to recognize that gradient-based approaches are appropri- 
ate in many settings. In particular, when gradients are available through an adjoint 
procedure [2], these techniques can be highly efficient. Successful applications of
gradient-based methods to oil field problems have been presented in many papers; 
see, e.g., [3,4,5,6].

Gradient-based approaches do, however, have some drawbacks. As a result of the 
nonconvex nature of the optimizations considered here, these problems generally 
contain multiple optima, and hence, a purely local search, which can get trapped in 
local solutions, might not be the best approach. In addition, for some problems (par- 
ticularly well placement), the optimization surface can be very rough, which results 
in discontinuous gradients. It is also important to recognize that derivative informa- 
tion is often not readily available. Adjoint-based techniques, which are a popular 
way for computing derivatives efficiently, are invasive with respect to the flow sim- 
ulator, and are therefore only feasible with full access to, and detailed knowledge 
of, the simulator source code. Numerical gradients are straightforward to calculate,
though this computation is expensive and may be subject to practical difficulties
(for example, in finite differencing, the selection of the perturbation size and/or
simulation tolerances can be problematic). Thus there is clearly a need for other,
derivative-free, techniques for oil reservoir optimization problems.
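
The cost and the step-size sensitivity of numerical gradients can be illustrated with a short sketch. The function names and the quadratic cost below are hypothetical stand-ins for a simulation-based objective; the point is only that a forward-difference gradient needs n additional function evaluations and that its accuracy depends on the perturbation size h.

```python
def forward_difference_gradient(f, x, h):
    """Approximate the gradient of f at x using n extra evaluations of f."""
    f0 = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h          # perturb one variable at a time
        grad.append((f(shifted) - f0) / h)
    return grad


def toy_cost(x):
    # Hypothetical stand-in for an expensive simulation-based cost function.
    return (x[0] - 1.0) ** 2 + 3.0 * x[1] ** 2


# The exact gradient at (0, 1) is (-2, 6). A large h gives truncation error,
# a very small h amplifies round-off (or, in practice, solver tolerances).
for h in (1e-2, 1e-6, 1e-14):
    print(h, forward_difference_gradient(toy_cost, [0.0, 1.0], h))
```

With a real simulator each call to `f` is a full flow simulation, so the n + 1 evaluations per gradient dominate the cost.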

The derivative-free techniques considered in this work are noninvasive with re- 
spect to the flow simulator. They treat the simulator as a black-box - only cost func- 
tion values are required and no explicit gradient calculations are involved. These 
methods are therefore much easier to implement than, for example, adjoint-based 
techniques, though this advantage is counterbalanced by a significant deterioration 
in computational efficiency compared to adjoint approaches. The computational cost 
associated with derivative-free methods depends strongly on the number of opti- 
mization variables considered (in adjoint-based schemes this dependence is much 
weaker). However, most of these algorithms parallelize naturally and easily, and 
therefore their efficiency, measured in terms of elapsed time, is usually satisfactory. 

Derivative-free optimization approaches can be divided into deterministic (e.g., 
generalized pattern search) and stochastic (e.g., particle swarm optimization) tech- 
niques. Stochastic approaches can be useful for dealing with rough functions or 
functions that contain multiple local optima. Based on the computational resources 
typically available in current practice (e.g., O(100) cores), derivative-free optimization
methods are appropriate when the number of optimization variables is at most
a few hundred [7].

Although gradient-free methodologies have been in existence for many years,
they have become widely used in only the last 20 years or so [9]. This relatively
recent uptake can be attributed to several factors, including the wide availability
of large numbers of cores (combined with algorithms that parallelize easily), the
significant theoretical results achieved in this period, and the successful application
of derivative-free techniques in a number of areas. Examples can be found
in molecular geometry [10], aircraft design [11,12], hydrodynamics [13,14] and
medicine [15,16].

Many derivative-free stochastic schemes have also been applied within the oil
industry. The field development problem has often been addressed by means of
global stochastic-search techniques; see, e.g., [17,18,19,20,21]. These stochastic
schemes have also been hybridized with deterministic search techniques, as
presented in [22,23,24]. Both global [25,26] and local (deterministic) [27,28]
derivative-free search techniques have been applied for well control optimization.
The history matching problem has also been approached from both a stochastic
point of view [29,18,30,31] and using local methodologies combined with regularization
and initial guess selection [32,33].

Our goal in this chapter is to illustrate the applicability of derivative-free optimization
methods for three types of problems arising in oil field operations. The
examples presented are taken from [28] (well control optimization), [21] (field
development optimization), and [32] (history matching). This chapter is structured
as follows. In Section 2.2 we briefly describe the simulation modeling procedures
and basic optimizers considered. Examples demonstrating the use of derivative-free
techniques for well control optimization, field development optimization, and
history matching are presented in Sections 2.3, 2.4, and 2.5, respectively.
Enhancements to the basic optimization algorithms required for the target problem
are discussed in these three sections. We end the chapter with a summary and
recommendations.



2.2 Basic Methodologies 

We now discuss the simulation techniques used in the optimizations, and describe 
the basic optimizers considered in this work. 



2.2.1 Simulation Techniques 

The optimization problems studied here rely on simulations of fluid flow in subsurface
formations. Additionally, in Section 2.5, equations describing wave diffraction
tomography must also be solved as part of the inverse modeling process. These sim- 
ulations require the numerical solution of systems of partial differential equations 
(PDEs). 

In this work we consider oil-water systems. These two components exist in sep- 
arate phases, both of which reside within the pore space of porous rock. Within 
the context of oil production, the subsurface formation containing oil (and associ- 
ated water) is referred to as a reservoir. The flow of oil and water in a reservoir is 
described by statements of mass conservation combined with constitutive (Darcy's 
law) relationships that relate phase flow rates to pressure gradient. For single-phase
flow, Darcy's law is given by $\mathbf{u} = -(k/\mu)\nabla p$, where $\mathbf{u}$ is the Darcy velocity
(volumetric flow rate divided by total area), $k$ is the absolute permeability, which is a key
property of the rock, $\mu$ is fluid viscosity and $p$ is fluid pressure. For two or three-phase
flow, this relationship is modified by the inclusion of the so-called relative
permeability function, which is a scalar function of local phase volume fraction.
Another key quantity is porosity $\phi$, which specifies the fraction of the bulk rock
volume that is pore space.
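
As a small worked example of single-phase Darcy's law, the following sketch evaluates $\mathbf{u} = -(k/\mu)\nabla p$ component by component. The function name and all numerical values are assumptions chosen for illustration (a permeability of about 100 millidarcy and a water-like viscosity), not data from the cases studied in this chapter.

```python
def darcy_velocity(permeability, viscosity, pressure_gradient):
    """Single-phase Darcy velocity u = -(k / mu) * grad p, per component."""
    mobility = permeability / viscosity
    return [-mobility * dp for dp in pressure_gradient]


# Illustrative SI values (assumed):
k = 1.0e-13                      # permeability in m^2, roughly 100 millidarcy
mu = 1.0e-3                      # viscosity in Pa*s, water-like
grad_p = [-1.0e4, 0.0, 0.0]      # pressure gradient in Pa/m

# Flow is directed from high to low pressure, here along +x.
print(darcy_velocity(k, mu, grad_p))
```

The resulting velocity magnitude is of the order of 1e-6 m/s, which illustrates how slow subsurface flow typically is.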

In most reservoir simulators, the governing equations are discretized using a fi- 
nite volume numerical procedure. The detailed equations and discretizations can be 
found in, e.g., [34,35]. In practical applications, simulation models may contain
0(10^ ~ 10^) grid blocks and may require several hundred time steps (the systems 
considered here are somewhat smaller). In addition, the discrete system of equa- 
tions is nonlinear and is solved using a Newton-Raphson procedure. Thus the evalu- 
ation of reservoir performance is computationally demanding. In this work we apply 
Stanford's general purpose research simulator (GPRS; [36,37]) for two of the cases
considered and the commercial streamline simulator 3DSL [38] for the other cases.
The streamline simulator shares many similarities with GPRS, though it uses the 
streamlines from the total velocity field (total velocity is equal to the sum of the 
water and oil Darcy velocities) to define a coordinate system that is used to solve 
the water transport equation. This introduces some approximations but it provides 
a more computationally efficient solution than would typically be achieved using a 
standard simulator. We note finally that, in the examples presented here, some secondary
effects (such as capillary pressure in all cases, compressibility in the streamline
simulations) are neglected. These effects could be included if necessary though
they would not be expected to impact our basic findings.

Seismic measurements involve first a number of sources, such as dynamite, air 
guns, or piezoelectric transducers, which send out elastic waves through the reser- 
voir. The transmitted and reflected waves are then recorded on geophones that re- 
spond to ground displacement or stresses. The recorded wavefields are processed 
and analyzed, and by means of a data assimilation process, such as that described in 
Section 2.5, can be used to infer the rock properties needed in the calculation of oil
production forecasts.

In this work diffraction tomography (see, e.g., [39,40,41]) simulations are used
as seismic measurements. The simulations for diffraction tomography require the 
numerical solution of the elastic wave equation, which describes the propagation of 
mechanical waves in elastic media. This equation is a statement of conservation of 
momentum, combined with the constitutive relation for an elastic material relating 
stresses to strains (Hooke's law). The velocity of the traveling waves depends on 
the elastic properties of the rock (Young's modulus and Poisson's ratio) and the 
density, which in turn depend on the rock type, porosity, and the saturations of the 
pore fluids. Rock physics models relate these rock and fluid properties to the seismic 
velocities. 

The wave equation is solved using the Born approximation [42,43], which is a
perturbation method applied to the scattering of waves in inhomogeneous media. 
In that approximation, the spatial heterogeneities in elastic properties are divided 
into a smooth background medium with fluctuations around the background. The 
wavefield is also divided into an incident wavefield traveling in the background 
medium along with a scattered wavefield from the heterogeneities. The contribu- 
tions from the scattered field are expressed in terms of an integral which is computed 
numerically. 



2.2.2 Optimization Problem Statement 

A general single-objective optimization problem, as is addressed in this chapter, can 
be stated as: 

$$\min_{x \in \Omega} f(x) \quad \text{subject to} \quad g(x) \le 0, \tag{2.1}$$

where $f(x)$ is the objective function (e.g., negative of net present value ($-$NPV) or
norm of discrepancy between measurements and model output), $x \in \mathbb{R}^n$ is the vector
of control variables (e.g., sequence of well pressures, locations for each well, or
calibration parameters), and $g : \mathbb{R}^n \to \mathbb{R}^m$ represents the nonlinear constraints in the
problem. Bound and linear constraints are included in the set $\Omega \subset \mathbb{R}^n$. As indicated
above, the objective function (and constraints, in some cases) are computed using
the output from a simulator.

Though the optimization problems considered in this work share some common- 
alities, there are important distinctions between them. Well control optimization is 
in most cases formulated in terms of continuous variables and includes nonlinear,
simulation-based constraints. Previous studies demonstrate that this problem often
displays multiple solutions with comparable cost function values [44,28]. For that
reason, this optimization is usually addressed using local search optimization techniques.
By contrast, the optimization landscapes found in field development problems
can be very rough [20], and this motivates the use of global search approaches.

As is the case with most inverse problems, history matching typically involves 
more unknowns than informative measurements, which leads to an underdetermined
optimization problem. Additionally, noise in the measurements can introduce rough- 
ness into the cost function. In our application, many of the multiple optima that can 
result from history matching are not consistent with prior geological information, 
and should therefore be discarded. Strategies for finding geologically realistic op- 
tima include regularization methodologies, performing a global exploration of the 
search space, and/or selecting a proper initial guess in local optimization schemes. 
Since the number of optimization parameters in history matching can be compa- 
rable to the number of grid blocks in the simulation model, parameter reduction 
techniques, which can be interpreted in regularization terms, are extremely helpful. 
These techniques can be used to assure consistency with prior geological informa- 
tion, as described in fl31l46l . 

Discrete-valued variables are common in optimization problems in the oil and 
gas industry. Such problems cannot in general be addressed by gradient-based op- 
timizers. In some cases, however, these variables can be treated as real-valued in 
order to establish a more amenable optimization problem (in this case we say that 
the discrete-valued variable is relaxed to a real-valued variable). 

2.2.3 Derivative-Free Optimization Methods 

In this section we describe, within an unconstrained real-valued optimization framework,
the derivative-free local and global methods applied in this chapter. Most of
these procedures can be extended to cases with discrete-valued variables, bound
and/or linear constraints and, with slightly more effort, to problems with computationally
inexpensive nonlinear constraints (in Section 2.3.1 we provide mathematically
sound procedures for handling simulation-based nonlinear constraints).
Additional enhancements of these basic methodologies are introduced for the case
examples when necessary. It is important to note that the variants devised for discrete
optimization are generally based on heuristics. In Sections 2.3 and 2.5 a
gradient-based method, sequential quadratic programming (SQP; see [47]), with
numerical derivatives is also considered to enable additional comparisons between
the various approaches. The SQP implementation used in this work is SNOPT [48].

2.2.3.1 Local Search Algorithms 

The local search techniques considered here are two different pattern search methods:
generalized pattern search and Hooke-Jeeves direct search. Pattern search optimization
has recently become popular as a result of the development of a solid
mathematical convergence theory [49,8,7] and of the increasing availability of parallel
computing resources. Pattern search schemes iteratively evaluate the cost function
in a stencil-based manner. This stencil is modified as iterations proceed, and
convergence theory requires that the stencil size eventually tends toward zero 1 8][3. 
By using a relatively large stencil size during the first stages in a pattern search tech- 
nique, some local minima can be avoided. This strategy may endow pattern search 
with a degree of robustness against noisy cost functions. We note that pattern search 
schemes (and, in general, most local as well as global optimizers) can be accelerated 
by means of computationally inexpensive surrogates. The use of surrogates can be 
quite useful for reservoir engineering problems given the large number of expensive 
objective function evaluations that are typically required. 

Generalized Pattern Search 

Generalized pattern search (GPS; [49,50]) comprises a family of optimization algorithms.
By considering different types of stencils and various strategies for evaluating
the stencil points (which is known as polling [50]), multiple GPS-based optimizers
can be constructed. For unconstrained optimization, the basic GPS iteration,
for a given stencil centered at the intermediate solution $x_0$, is as follows. First, the
objective function is evaluated for a number of stencil points. If some of these points
yield cost function improvement, the current solution is updated with either the best
point (if the full stencil is evaluated) or the first point that improves the solution (if
an opportunistic search is used). The stencil can then be modified, but in most implementations
it stays unaltered. If none of the stencil points improves on $x_0$, then
the stencil size is decreased. The search progresses until some stopping criterion is
satisfied (typically, a minimum stencil size).

The stencil should contain a generating set for $\mathbb{R}^n$ [8]. A generating set of vectors
has the property that, if $\nabla f(x_0) \neq 0$, then at least one element of the set is a descent
direction [8]. Though only $n + 1$ points are needed to establish a generating set
for $\mathbb{R}^n$, stencils containing $2n$ elements are commonly used in GPS. We illustrate
these two types of stencils in Figure 2.1(a) and 2.1(b).
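
The generating-set property can be checked numerically with a short sketch. The two direction sets below are standard textbook choices assumed here for illustration (the 2n coordinate directions, and the n coordinate directions plus their negative sum); the function names and the sample gradient are hypothetical.

```python
def compass_directions(n):
    """The 2n coordinate directions +e_i and -e_i."""
    directions = []
    for i in range(n):
        for sign in (1.0, -1.0):
            d = [0.0] * n
            d[i] = sign
            directions.append(d)
    return directions


def minimal_directions(n):
    """n + 1 directions: e_1, ..., e_n and -(e_1 + ... + e_n)."""
    directions = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    directions.append([-1.0] * n)
    return directions


def has_descent_direction(directions, gradient):
    """True if some direction d satisfies gradient . d < 0."""
    return any(sum(g * c for g, c in zip(gradient, d)) < 0.0 for d in directions)


gradient = [0.3, -2.0]   # hypothetical nonzero gradient at the stencil center
print(has_descent_direction(compass_directions(2), gradient))
print(has_descent_direction(minimal_directions(2), gradient))
```

For any nonzero gradient both checks succeed, which is the property that lets a sufficiently small stencil always find an improving point on a smooth function.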

If the stencil polling process is opportunistic then, as soon as a point improving
on the current solution is found, the stencil is moved to that new point. Therefore,
only a subset of stencil points will be polled at a given iteration. We show an example
of opportunistic polling for a two-dimensional compass stencil in Figure 2.1(c).
The point in the east direction is assumed to yield improvement over $x_0$. As a consequence,
the other three points are not evaluated.
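
The basic iteration described above, with either full or opportunistic polling, can be sketched as a plain compass search. This is a minimal illustration under assumed settings (initial stencil size 1, halving on failure, a quadratic test function), not the implementation used for the examples in this chapter.

```python
def compass_search(f, x0, step=1.0, min_step=1e-6, opportunistic=False,
                   max_iter=10_000):
    """Minimize f by polling the 2n compass points around the current center."""
    x, fx = list(x0), f(x0)
    for _ in range(max_iter):
        if step < min_step:              # stopping criterion: minimum stencil size
            break
        best_x, best_f = None, fx
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * step
                f_trial = f(trial)
                if f_trial < best_f:
                    best_x, best_f = trial, f_trial
                    if opportunistic:    # accept the first improving point
                        break
            if opportunistic and best_x is not None:
                break
        if best_x is None:
            step *= 0.5                  # unsuccessful poll: shrink the stencil
        else:
            x, fx = best_x, best_f       # successful poll: move the stencil
    return x, fx


def toy_cost(x):
    # Hypothetical stand-in for a simulation-based cost function.
    return (x[0] - 1.0) ** 2 + 3.0 * (x[1] + 0.5) ** 2


print(compass_search(toy_cost, [0.0, 0.0]))
print(compass_search(toy_cost, [0.0, 0.0], opportunistic=True))
```

Full polling spends up to 2n evaluations per iteration but picks the best stencil point, whereas opportunistic polling may stop after a single evaluation.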

In GPS the set of directions in the stencil remains the same at each iteration,
which typically provides a coordinate or compass search, as depicted in
Figure 2.1(a). The approach can be further generalized by iteratively varying the
set of directions in the stencil. For example, at a given iteration the stencil for a
two-dimensional optimization problem could be as shown in Figure 2.1(a). Upon
polling success, the new stencil is rotated arbitrarily, as in Figure 2.1(d). If the stencil
is randomly selected from an asymptotically dense set of directions, the resulting
algorithm is the mesh adaptive direct search (MADS; [51]). The MADS approach
may be beneficial in situations where the cost function is noisy [51].

Fig. 2.1 Types of stencil-based search for a two-dimensional space: (a) positive basis with
2n directions (compass), (b) positive basis with n + 1 directions, (c) opportunistic search (the
first point tried, the one in the east direction, is assumed to improve on $x_0$; the other points, for
which the cost function is not evaluated, are plotted with dashed lines), and (d) mesh adaptive
compass search (the stencil changes randomly at every iteration)

If the polling process is not opportunistic (which means the cost function is eval- 
uated for all stencil points), generalized pattern search requires on the order of n 
function evaluations per iteration. However, the GPS method parallelizes naturally 
since, at a particular iteration, the objective function evaluations at the polling points 
are completely independent and can thus be accomplished in a distributed fashion. 
We note that opportunistic polling is well suited to situations where parallel com- 
puting resources are limited or unavailable. 
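
The independence of the polling evaluations can be exploited as in the following sketch, which evaluates all compass points of one iteration concurrently. A thread pool from the Python standard library is assumed here for simplicity; with a real simulator each evaluation would more likely be dispatched to a separate process or cluster node. The function names and the test function are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def poll_points(x, step):
    """The 2n compass points around x for the given stencil size."""
    points = []
    for i in range(len(x)):
        for sign in (1.0, -1.0):
            p = list(x)
            p[i] += sign * step
            points.append(p)
    return points


def parallel_poll(f, x, step, max_workers=4):
    """Evaluate every poll point concurrently and return the best one."""
    points = poll_points(x, step)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        values = list(pool.map(f, points))   # results keep the input order
    best = min(range(len(points)), key=values.__getitem__)
    return points[best], values[best]


def toy_cost(x):
    # Hypothetical stand-in for one (expensive) simulation run.
    return (x[0] - 1.0) ** 2 + 3.0 * (x[1] + 0.5) ** 2


print(parallel_poll(toy_cost, [0.0, 0.0], 1.0))
```

With at least 2n workers, the elapsed time of a full poll is roughly that of a single function evaluation.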

Hooke-Jeeves Direct Search 

Hooke-Jeeves direct search (HJDS; [52]) is a compass-based pattern search method.
There are two different types of moves in HJDS: exploratory and pattern. In the exploratory
move the cost function is evaluated at consecutive perturbations of the
stencil center $x_0$ in the coordinate directions. All directions are polled opportunistically.
The exploratory move resembles a numerical gradient estimation (with a
perturbation size that may initially be large, but that eventually tends to zero). If
no cost function improvement is found in the exploratory step (and this implies $2n$
function evaluations), the stencil size is decreased.

Otherwise, a new point $x_1$ is obtained, and the next exploratory move is centered
at $x_0 + 2(x_1 - x_0)$. This aggressive step in the underlying successful direction
is the pattern move, which is somewhat analogous to a line search procedure. The
pattern move can be beneficial in situations where an optimum is far from the current
solution. If the new exploratory step yields no cost function decrease, another
opportunistic compass search is centered at $x_1$, and if, again, this search yields no
improvement, the step size is reduced, keeping the stencil at $x_1$. Because HJDS is
inherently sequential, it is most appropriate for use with serial computing resources.
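
The interplay of exploratory and pattern moves can be sketched as follows. This is a minimal reading of the description above with assumed settings (initial step 1, halving on failure, a quadratic test function); the function names are hypothetical and the code is not the implementation used in this chapter.

```python
def explore(f, center, f_center, step):
    """Opportunistic coordinate search: keep each improving perturbation."""
    x, fx = list(center), f_center
    for i in range(len(x)):
        for sign in (1.0, -1.0):
            trial = list(x)
            trial[i] += sign * step
            f_trial = f(trial)
            if f_trial < fx:
                x, fx = trial, f_trial
                break                     # move on to the next coordinate
    return x, fx


def hooke_jeeves(f, x0, step=1.0, min_step=1e-6, max_iter=10_000):
    base, f_base = list(x0), f(x0)        # best point found so far
    center, f_center = base, f_base       # center of the next exploratory move
    for _ in range(max_iter):
        if step < min_step:
            break
        cand, f_cand = explore(f, center, f_center, step)
        if f_cand < f_base:
            # Success: pattern move, next exploration around x0 + 2 (x1 - x0).
            center = [2.0 * c - b for b, c in zip(base, cand)]
            f_center = f(center)
            base, f_base = cand, f_cand
        elif center is not base:
            # The pattern move failed: explore again around the best point.
            center, f_center = base, f_base
        else:
            step *= 0.5                   # exploration around the best point failed
    return base, f_base


def toy_cost(x):
    # Hypothetical stand-in for a simulation-based cost function.
    return (x[0] - 1.0) ** 2 + 3.0 * (x[1] + 0.5) ** 2


print(hooke_jeeves(toy_cost, [0.0, 0.0]))
```

Each exploratory move depends on the outcome of the previous evaluation, which is why the method does not distribute as naturally as a full GPS poll.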
2.2.3.2 Global Search Algorithms

The global search approaches applied in this work are a genetic algorithm and par- 
ticle swarm optimization. These techniques share some similarities as they are both 
based on abstractions of natural processes, have a markedly stochastic nature, and 
apply sequential updating of a set of solutions (population of individuals in genetic 
algorithms, swarm of particles in particle swarm optimization). 

Genetic Algorithms 

Genetic algorithms (GAs) are well known and widely used so our discussion here 
will be brief (refer to [53] for a detailed description). GAs are inspired by the theory
of natural selection. An iteration starts with a population of individuals, which
is ranked in terms of cost function (referred to as fitness in the context of GAs). 
Thereafter, a set of operators, typically selection, crossover and mutation, are ap- 
plied to generate a new population. The population size, like the swarm size in 
particle swarm optimization, has a marked impact on the performance of GAs. With 
a proper population size, a genetic algorithm can be used to explore complex objec- 
tive function landscapes, and to thus identify promising regions in the search space. 
A thorough global exploration, even for a moderate number of optimization vari- 
ables, often requires many function evaluations, and accordingly, a large population 
size. However, the cost function computation for all of the individuals can be readily 
performed in a distributed manner. 

Particle Swarm Optimization 

Particle swarm optimization (PSO; [54, 55]) was introduced by Kennedy and Eberhart
in the mid 1990s. The algorithm mimics the social behaviors exhibited by
swarms of animals. At each PSO iteration, all particles in the swarm move to a new
position in the search space. Let $x_{i,k} \in \mathbb{R}^n$ be the position of particle $i$ at iteration $k$,
$x^*_{i,k}$ represent the best position (solution) found by particle $i$ up to iteration $k$, and $y^*_{i,k}$
be the best position found by any of the particles in the 'neighborhood' of particle $i$
up to iteration $k$. The neighborhood can include all of the PSO particles, in which
case the algorithm is referred to as global-best PSO. Other neighborhood specifications
[56] limit particle communication such that particle $i$ interacts with only a
subset of the swarm (this has been observed to be useful in avoiding premature convergence).
The new position of particle $i$ at iteration $k+1$, $x_{i,k+1}$, is computed by
adding a so-called velocity term, $v_{i,k} \in \mathbb{R}^n$, to the current position $x_{i,k}$ [54, 55, 57]:

$$x_{i,k+1} = x_{i,k} + v_{i,k}. \tag{2.2}$$

The velocity $v_{i,k}$ is in turn calculated as follows:

$$v_{i,k} = \omega v_{i,k-1} + c_1 r_1 \circ (x^*_{i,k} - x_{i,k}) + c_2 r_2 \circ (y^*_{i,k} - x_{i,k}), \tag{2.3}$$

where $\omega$, $c_1$, and $c_2$ are weights, $r_1$ and $r_2$ are random vectors in $\mathbb{R}^n$ with components
uniformly distributed in the interval (0, 1), and $\circ$ denotes the Hadamard
(component-wise) product. Thus, we see that each particle moves to a new position
based on its existing trajectory, its own memory, and the collective experience
of neighboring particles. These three velocity contributions are referred to as the
inertia, cognitive, and social components [54, 57].
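
The position and velocity updates of Eqs. (2.2) and (2.3) can be sketched in a few lines of code. The following is only an illustrative global-best PSO loop, not the implementation used in the cited work; the function names, the weight values (0.7, 1.5, 1.5), and the synchronous update of the neighborhood best are assumptions made for the example.

```python
import random

def pso_step(x, v, x_best, y_best, omega, c1, c2, rng):
    """Apply Eqs. (2.2)-(2.3) to one particle; all vectors are lists of length n."""
    v_new = []
    for j in range(len(x)):
        r1 = rng.random()  # component of r1, uniform in [0, 1)
        r2 = rng.random()  # component of r2, uniform in [0, 1)
        v_new.append(omega * v[j]                      # inertia
                     + c1 * r1 * (x_best[j] - x[j])    # cognitive
                     + c2 * r2 * (y_best[j] - x[j]))   # social
    x_new = [xj + vj for xj, vj in zip(x, v_new)]
    return x_new, v_new

def global_best_pso(f, swarm, n_iter, omega=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f with global-best PSO (weights are illustrative assumptions)."""
    rng = random.Random(seed)
    xs = [list(p) for p in swarm]
    vs = [[0.0] * len(p) for p in swarm]
    bests = [list(p) for p in swarm]      # best position seen by each particle
    f_bests = [f(p) for p in swarm]
    for _ in range(n_iter):
        # Neighborhood = whole swarm, so every particle shares the same y_best.
        g = min(range(len(xs)), key=lambda i: f_bests[i])
        y_best = list(bests[g])
        for i in range(len(xs)):
            xs[i], vs[i] = pso_step(xs[i], vs[i], bests[i], y_best,
                                    omega, c1, c2, rng)
            fx = f(xs[i])
            if fx < f_bests[i]:
                bests[i], f_bests[i] = list(xs[i]), fx
    g = min(range(len(xs)), key=lambda i: f_bests[i])
    return bests[g], f_bests[g]
```

Because the cost function evaluations inside the inner loop are independent of one another within an iteration, they are the natural place to distribute work across cores when each evaluation is an expensive simulation.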

Some constraints can be handled in PSO through use of the 'absorption' technique
[56, 58, 59]. With this approach, particles corresponding to infeasible solutions
are moved to the nearest constraint boundary, and the corresponding velocity
components are set to zero. We should note that this constraint handling procedure
should be accompanied by an efficient scheme for projecting infeasible points back
into the feasible domain. When this projection algorithm cannot be applied (e.g., for
simulation-based constraints), the penalty function approach is a likely viable alternative
(though this approach is not exempt from potential issues; see Section 2.3.1).
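
For the simplest case of bound constraints, where projecting onto the feasible domain reduces to clipping each coordinate, the absorption idea can be sketched as follows. This is a minimal illustration under that assumption; the function name `absorb` and the list-based representation are hypothetical and not taken from the cited references.

```python
def absorb(x, v, lower, upper):
    """Move an out-of-bounds particle to the nearest bound and zero the
    velocity components along the coordinates that violated a bound."""
    x_new, v_new = [], []
    for xj, vj, lo, hi in zip(x, v, lower, upper):
        if xj < lo:
            x_new.append(lo)
            v_new.append(0.0)
        elif xj > hi:
            x_new.append(hi)
            v_new.append(0.0)
        else:
            # Feasible coordinate: position and velocity are left unchanged.
            x_new.append(xj)
            v_new.append(vj)
    return x_new, v_new
```

General nonlinear constraints do not admit such a cheap projection, which is why the text falls back on penalty functions in that case.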

2.3 Well Control Optimization with Operational Constraints 

The optimization of well settings/controls typically entails maximizing either net 
present value (NPV) or the cumulative volume of oil produced through time by 
finding the optimal well flow rates or pressures (these pressures are referred to as 
bottom-hole pressures or BHPs). In many actual scenarios, and in the cases consid- 
ered here, water is injected to drive the oil toward production wells and to maintain 
reservoir pressure. Secondary objectives could include minimizing the total volume 
of water injected or produced, or maximizing the initial oil production rate. The 
problem is usually solved subject to operational constraints, such as maximum and 
minimum BHP, maximum water injection rate, maximum well water cut (fraction 
of water in the produced fluid), etc. The optimization variables are generally real- 
valued, and the relationships between these variables and both the objective function 
and constraints are in general nonlinear. Thus, the problem can be addressed by nonlinear
programming techniques [47].

The production optimization cases presented here involve the maximization of 
undiscounted NPV by adjusting the BHPs of water injection and production wells 
(well flow rates could also have been the optimization variables). The objective 
function we seek to minimize is 

$$f(x) = -\mathrm{NPV}(x) = -r_o Q_o(x) + c_{wp} Q_{wp}(x) + c_{wi} Q_{wi}(x), \tag{2.4}$$

where $r_o$ is the price of oil ($/STB, where 'STB' stands for stock tank barrel;
1 STB = 0.1590 m^3), $c_{wp}$ and $c_{wi}$ are the costs of produced and injected water
($/STB), respectively (produced water reduces NPV due to pumping and separation
costs), and $Q_o$, $Q_{wp}$ and $Q_{wi}$ are the cumulative oil production, water production
and water injection (STB) obtained from the simulator.
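
As a worked illustration of Eq. (2.4), the short sketch below evaluates the objective from three cumulative volumes. In practice these volumes come from a reservoir simulation run; here they are hypothetical inputs, and the default prices are the ones quoted for the example in Section 2.3.2. The function name is an assumption.

```python
def negative_npv(q_oil, q_water_prod, q_water_inj,
                 r_o=50.0, c_wp=10.0, c_wi=5.0):
    """Return f = -NPV in dollars (undiscounted), following Eq. (2.4).

    Cumulative volumes are in STB; r_o, c_wp and c_wi are in $/STB.
    """
    return -r_o * q_oil + c_wp * q_water_prod + c_wi * q_water_inj

# Hypothetical cumulative volumes (STB), chosen only to exercise the formula.
f_value = negative_npv(q_oil=1.0e6, q_water_prod=2.0e5, q_water_inj=1.5e6)
```

With these inputs the oil revenue is $50 million and the water handling costs total $9.5 million, so `f_value` corresponds to an NPV of $40.5 million.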

2.3.1 Constraint Handling Techniques

The nonlinear programming methods applied here are generalized pattern search 
(GPS), Hooke-Jeeves direct search (HJDS), and a genetic algorithm (GA), with 
enhancements introduced to deal with general constraints. Consistent with the 
derivative-free spirit of this work, the constraint handling techniques considered, 
namely penalty functions and filter methods, allow us to continue treating the simu- 
lator as a black-box. These methodologies are not exclusive to gradient-free optimiz- 
ers, so they could be implemented with a wide variety of optimization approaches. 
The description below of constraint handling techniques follows the discussion presented
in [28].

Penalty Functions 

The penalty function method (see, e.g., [47]) for general optimization constraints
entails modification of the objective function with a penalty term that depends on
some measure of the constraint violation $h : \mathbb{R}^n \to \mathbb{R}$. The modified optimization
problem

$$\min_{x \in \Omega} f(x) + \rho\, h(x), \tag{2.5}$$

where $\rho > 0$ is a penalty parameter, may still have constraints, but they should be
straightforward to handle (for example, bound constraints). In this work we apply
$h(x) = \|g^+(x)\|_2$, with $g^+ : \mathbb{R}^n \to \mathbb{R}^m$ defined as $g_i^+(x) = \max\{0, g_i(x)\}$ (normalizing
the constraints can be beneficial since they are all weighted equally in the
penalty term). If the penalty parameter is iteratively increased (tending to infinity),
the solution of the modified optimization problem (2.5) converges to that of the
original nonlinearly constrained problem. However, the sequence of values to use
for $\rho$ may require some numerical experimentation and the overall procedure can
lead to significant additional computation. In certain cases, a finite (and fixed) value
of the penalty parameter also yields the correct solution (this is the so-called exact
penalty; see [47]). However, for exact penalties, the modified cost function is not
smooth around the solution [47], and thus the corresponding optimization problem
can be challenging to solve.
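
The following sketch shows how the penalized objective of Eq. (2.5) can be assembled around a black-box cost function. It assumes constraints written as $g_i(x) \le 0$ and takes the violation measure as the Euclidean norm of the positive parts; the one-dimensional toy problem and the brute-force grid search standing in for the optimizer are assumptions made purely for illustration.

```python
import math

def constraint_violation(g_values):
    """h(x) = ||g+(x)||_2, where g_i+ = max(0, g_i) and g_i <= 0 means satisfied."""
    return math.sqrt(sum(max(0.0, g) ** 2 for g in g_values))

def penalized(f, g, rho):
    """Return the modified objective x -> f(x) + rho * h(x) of Eq. (2.5)."""
    return lambda x: f(x) + rho * constraint_violation(g(x))

def grid_minimizer(func, lower, upper, n_points):
    """Brute-force stand-in for the optimizer: best point on a uniform grid."""
    step = (upper - lower) / (n_points - 1)
    grid = [lower + k * step for k in range(n_points)]
    return min(grid, key=func)

# Toy problem (hypothetical): minimize f(x) = x subject to x >= 1,
# written as g(x) = 1 - x <= 0, with bound constraints -5 <= x <= 5.
f = lambda x: x
g = lambda x: [1.0 - x]

# Increasing sequence of penalty parameters, one inner minimization each.
solutions = {rho: grid_minimizer(penalized(f, g, rho), -5.0, 5.0, 1001)
             for rho in (0.5, 2.0, 10.0)}
```

In this toy case a penalty parameter below 1 leaves the minimizer at the lower bound, while any value above 1 places it at the constrained optimum x = 1. The penalized function has a kink there, which is the nonsmoothness the text associates with exact penalties.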

Filter Method 

The penalty function approach is straightforward to implement but, as discussed
above, can introduce some potential difficulties and complications. Filter methods
[60, 47] provide an alternate and systematic approach for handling general
constraints. A filter is a set of pairs $(h(x), f(x))$, such that no pair dominates another
pair. The concept of dominance, borrowed from multi-objective optimization,
is defined as follows: the point $x_1 \in \mathbb{R}^n$ dominates $x_2 \in \mathbb{R}^n$ if and only if either
$f(x_1) \le f(x_2)$ and $h(x_1) < h(x_2)$, or $f(x_1) < f(x_2)$ and $h(x_1) \le h(x_2)$. In this
work, the constraint violation $h$ associated with the filter method is computed the same
way as described above for the penalty method.

Filters have been combined with a variety of basic optimization algorithms including
sequential quadratic programming [60], interior point methods [61], and
pattern search techniques [62, 63]. They can be understood as essentially an add-on
for a basic optimization procedure. Within the context of a pattern search method,
a filter acts to modify the standard acceptance criterion which, as discussed in
Section 2.2.3, is based only on cost function improvement. At a given iteration, the
basic optimization algorithm proposes a number of intermediate solutions. These
solutions are accepted if they are not dominated by any point in the filter. Prior
to continuing with the next iteration, the filter is updated based on all the points
evaluated by the optimizer. Using filters, the original problem (2.1) is thus viewed
as a bi-objective optimization: besides minimizing the cost function $f(x)$, we also
minimize the constraint violation $h(x)$. Using this multi-objective perspective, the
optimization search is enriched by considering infeasible points. We reiterate that
the ultimate solution is intended to be feasible (it may however show a very small
constraint violation).
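
The dominance test and the filter-based acceptance rule can be sketched as below. This is a minimal illustration of the bookkeeping only, with each point represented by its pair of constraint violation and cost; the function names and the policy of discarding newly dominated entries on insertion are assumptions, and a production implementation would typically add safeguards not shown here.

```python
def dominates(p, q):
    """True if pair p = (h, f) dominates pair q = (h, f)."""
    h_p, f_p = p
    h_q, f_q = q
    return (f_p <= f_q and h_p < h_q) or (f_p < f_q and h_p <= h_q)

def is_acceptable(candidate, filt):
    """A candidate is acceptable if no entry of the filter dominates it."""
    return not any(dominates(entry, candidate) for entry in filt)

def update_filter(filt, candidate):
    """Return the filter after considering one evaluated point."""
    if candidate in filt or not is_acceptable(candidate, filt):
        return list(filt)
    # Keep only entries the candidate does not dominate, then add it.
    return [e for e in filt if not dominates(candidate, e)] + [candidate]

# Hypothetical (h, f) pairs evaluated by the optimizer, in order.
filt = []
for point in [(0.4, 10.0), (0.1, 12.0), (0.5, 11.0), (0.05, 9.0)]:
    filt = update_filter(filt, point)
```

The third point is rejected because the first one is better in both measures, and the last point replaces all earlier entries because it dominates each of them.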

2.3.2 Production Optimization Example 



This example is taken from [28]. The reservoir is a portion of the synthetic SPE 10
model [64]. It is represented on a three-dimensional grid containing 60 x 60 x 5
blocks. The reservoir contains oil and water. The 25 wells (16 water injectors and
nine producers) are distributed following a five-spot pattern (see Figure 2.2). This
model is similar to models used in practice except it contains fewer grid blocks.
The variation in permeability, evident in Figure 2.2, strongly impacts the flow field.
By optimizing the well settings, we can achieve a more uniform distribution of the
injected water, thus increasing the amount of oil produced and maximizing NPV.

Fig. 2.2 Well configurations and top layer of the geological model considered in the production
optimization case in Section 2.3. Grid blocks are colored to indicate value of permeability
(red is high permeability, blue is low permeability). Injection and production wells are
represented as blue and red circles, respectively (from [28]); see online version for colors.

Reservoir production proceeds for a total of 1460 days. The BHP of each well 
is updated every 365 days. There are thus a total of four control intervals. Since 
there are 25 wells, the number of optimization variables is 100. During each control
interval, the BHPs are held constant. Injection well BHPs are specified to
be in the range 6500 to 12000 psi and production wells are constrained to the
range 500 to 5500 psi.
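
The size of the control vector and its bound constraints follow directly from these numbers. The sketch below builds the lower and upper bound vectors and the midpoint of the resulting box; the ordering of the variables (by control interval, injectors before producers) is an assumption made for illustration and is not specified in the text.

```python
N_INJECTORS, N_PRODUCERS, N_INTERVALS = 16, 9, 4
INJ_BOUNDS = (6500.0, 12000.0)   # injector BHP range, psi
PROD_BOUNDS = (500.0, 5500.0)    # producer BHP range, psi

def build_bounds():
    """Return (lower, upper) bound lists, one entry per well and interval."""
    lower, upper = [], []
    for _ in range(N_INTERVALS):
        lower += [INJ_BOUNDS[0]] * N_INJECTORS + [PROD_BOUNDS[0]] * N_PRODUCERS
        upper += [INJ_BOUNDS[1]] * N_INJECTORS + [PROD_BOUNDS[1]] * N_PRODUCERS
    return lower, upper

lower, upper = build_bounds()
n_variables = len(lower)  # 25 wells times 4 control intervals
# Midpoint of the box defined by the bounds, a natural starting point.
center = [(lo + hi) / 2.0 for lo, hi in zip(lower, upper)]
```

The midpoint evaluates to 9250 psi for every injector and 3000 psi for every producer in each control interval.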

The additional constraints, which are nonlinear, specify that (1) the maximum 
field-wide water injection rate not exceed 15000 STB/day, (2) the maximum field-wide
liquid (oil+water) production rate not exceed 10000 STB/day, (3) the minimum
field-wide oil production rate not fall below 3000 STB/day, and (4) the fraction of 
water in the produced fluid (water cut) not exceed 0.7 in any of the nine production 
wells. The oil price considered is $50/STB, and the costs of produced and injected 
water are $10/STB and $5/STB, respectively. Additional details of the problem
specification are provided in [65].

Based on results for another nonlinearly constrained production optimization
problem presented in [28], we apply the following four approaches for this case:
sequential quadratic programming (SQP) with numerical derivatives and an active
set constraint handling method [47], generalized pattern search (GPS) with penalty
function, GPS with filter, and Hooke-Jeeves direct search (HJDS) with filter. The
gradients required by SQP were computed using second-order finite differencing,
with a perturbation size of 0.1 psi (this perturbation size was established through
numerical experimentation; we reiterate that choosing it can be an issue when estimating
gradients numerically). In all cases, the initial stencil size for GPS and HJDS
was 1375 psi. The penalty method relies on some heuristics for increasing the
penalty parameter and terminating each corresponding intermediate optimization.
Details on the strategy used here can be found in [28]. The two approaches considered
with the filter method, GPS and HJDS, do not rely nearly as directly on
heuristics.

The initial guess $x_0$ for all methods was the center of the orthotope given by
the bound constraints (i.e., BHP of 9250 psi for all injectors at all times, BHP
of 3000 psi for all producers at all times). This reference case has an associated NPV
of $193.43 million and a constraint violation value of 0.3731. The optimization results
are summarized in Table 2.1.

Table 2.1 Performance summary for the production optimization case (from [28])

Optimization approach     Number of simulations   Max. NPV [$ MM]   h
SQP + active set          41004                   341.32            0.0031
GPS + penalty function    60001                   342.95            0.0000
GPS + filter              39201                   342.61            0.0001
HJDS + filter             1618                    336.28            0.0001

Consistent with the underdetermined nature of the optimization problem, the solutions
computed by the four approaches differ. The
NPVs for the first three methods are within 0.5% of one another, though the NPV 
for the last method (HJDS with filter) is about 1.5% less. Note that all algorithms 
except GPS with penalty function have nonzero constraint violations. For the filter- 
based methods, we allowed a constraint violation of 0.0001. Were we to require zero 
constraint violation, GPS with filter would provide an NPV of $341.12 million, and 
HJDS with filter would provide an NPV of $332.93 million. 

All algorithms other than HJDS were implemented within a distributed computing
environment (67 cores were used, which provided a speedup factor of
around 50). We therefore observe that, although SQP and GPS with filter required
about 24 times as many function evaluations as HJDS, in terms of elapsed
time, these two methods required only about half the time of HJDS. The procedure
that required the highest number of function evaluations, GPS with penalty function,
needed about 3/4 of the time of HJDS. This highlights the impact of the availability
of multiple cores on algorithm selection. We note finally that, although the results
in Table 2.1 for GPS with penalty function and GPS with filter are similar, the filter
method is less heuristic and may, therefore, be preferable for many problems.
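
The elapsed-time comparison can be checked with simple arithmetic on the simulation counts of Table 2.1. The sketch below assumes that every simulation takes the same time and that the reported speedup of about 50 applies uniformly to the three distributed methods; both are simplifications, and the function name is hypothetical.

```python
SPEEDUP = 50.0  # approximate speedup reported for the 67-core runs

SIMULATIONS = {
    "SQP + active set": 41004,
    "GPS + penalty function": 60001,
    "GPS + filter": 39201,
    "HJDS + filter": 1618,
}

def relative_time(method):
    """Elapsed time of a method relative to serial HJDS with filter."""
    count = SIMULATIONS[method]
    # HJDS ran serially; the other methods ran in the distributed environment.
    effective = count if method == "HJDS + filter" else count / SPEEDUP
    return effective / SIMULATIONS["HJDS + filter"]

ratios = {name: relative_time(name) for name in SIMULATIONS}
```

Under these assumptions SQP and GPS with filter come out at roughly half the elapsed time of HJDS, and GPS with penalty function at roughly three quarters, in line with the figures quoted above.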

We now illustrate the degree of nonlinear constraint satisfaction provided by the
various optimization algorithms. Figures 2.3 and 2.4 present the field-wide fluid
production rates and the maximum of the water cut in any producer well. The red
horizontal lines in these figures indicate the constraint value. It is evident that, at
late time, the initial guess settings lead to constraint violations. The constraints are
essentially satisfied by the other algorithms, with the exception of SQP. This occurs
because our SQP stopping criterion does not enforce strict feasibility. SQP does,
however, encounter solutions during the course of the optimization with lower constraint
violations but also with lower NPVs. Thus, it is clear that the SQP results
could be improved if it were used with a filter.

Fig. 2.3 Total field-wide fluid production rate for the initial guess $x_0$ and the four solutions
found for the production optimization case. The red line indicates the maximum total fluid
rate allowed. GPS1 and GPS2 denote GPS with the penalty function and the filter method,
respectively (from [28]); see online version for colors.

Fig. 2.4 Maximum well water cut for the initial guess $x_0$ and the four solutions found for the
production optimization case. The maximum water cut at a given time is the maximum of the
water cut values for all producer wells at that time. The red line indicates the maximum water
cut allowed for any producer well. GPS1 and GPS2 denote GPS with the penalty function and
the filter method, respectively (from [28]); see online version for colors.

The quantities that directly impact NPV are displayed in Figure 2.5, where we
show the production and injection profiles for $x_0$ and for the solution computed by
GPS with filter. The peaks in the rates in the optimized solution, evident every 365
days, result from the changes in the well BHPs, which occur at those times. It is
evident that, relative to the initial guess, the optimized controls lead to a significant
increase in cumulative oil production along with a significant decrease in cumulative
water production (note that cumulative oil production corresponds to the integral of
the curve shown in Figure 2.5, and similarly for other quantities). The cumulative
water injection does not vary significantly between the two cases. This example
illustrates the substantial gains that can potentially be achieved in oil field operations
through the use of computational optimization.



Fig. 2.5 Total field-wide production and injection rates for the initial guess $x_0$ and solution
computed by GPS with filter for the production optimization case. Top: Oil (red) and water
(blue) production rates. Bottom: Water injection rate (from [28]); see online version for
colors.

2.4 Optimal Well Placement with Particle Swarm Optimization

The general problem of field development optimization involves the determination
of how many new wells to drill, what type of wells these should be (i.e., injection
well or production well; vertical, horizontal or multi-branched well; type of
downhole instrumentation), and the drilling schedule, in order to maximize a prescribed
objective function. In previous work, a number of gradient-based and
derivative-free procedures have been developed and applied for this problem (see
[20] for a full discussion). Of the stochastic search approaches employed, many
researchers have applied genetic algorithms (e.g., [23, 66, 24, 67, 68, 19, 69, 70]),
though simultaneous perturbation stochastic approximation algorithms [71], as well
as other approaches, have also been explored [22]. Matott et al. [72] evaluated several
optimization algorithms for a groundwater remediation problem and achieved
the best results using particle swarm optimization (PSO). This motivated the use of
PSO for optimization of oil field development in [20]. Consistent with [72], in [20]
PSO was found to outperform GA for several example cases. All of these examples
involved relatively few wells (20 or fewer).

The development of large-scale oil fields, however, often involves drilling many
wells. If we restrict ourselves for now to vertical wells (which can be either production
or injection wells) that penetrate the entire thickness of the formation, the
optimization variables include the areal $(x, y)$ location of each well and a binary
variable $b$ defining the well type. Thus there are a total of $n = 3N_w$ optimization
variables, where $N_w$ is the number of wells. Even given the restriction of fully-penetrating
vertical wells, the optimization problem is challenging. For large-scale
problems, $N_w$ can be several hundred, so the number of optimization variables can
be large. In addition, for large $N_w$ the imposition of well-to-well distance constraints
(which are commonly used in field applications) can lead to a large number of infeasible
solutions, and this can negatively impact the performance of a population-based
algorithm such as PSO. Another key concern is that the number of wells $N_w$
should itself be an optimization variable. Direct inclusion of $N_w$ as an integer variable
in the set of parameters will further complicate the optimization and will lead
to much larger computational requirements.



2.4.1 Optimization Methodology 

In recent work, a field development optimization procedure that addresses some
of the issues raised above was presented [21]. In this implementation, rather than
prescribe $N_w$ and optimize $3N_w$ parameters, the wells were constrained to be arranged
in repeated patterns (such patterns are commonly used for onshore oil field
development). By optimizing the parameters that define the well patterns, a close-to-optimal
$N_w$ and the locations and types of all wells can be determined. This method
would theoretically be expected to lead to suboptimal results relative to those that
could be achieved by optimizing the number of wells and the associated $3N_w$ parameters,
but it is much more tractable computationally than the more exhaustive
approach.

In this section we describe and then apply this new well pattern optimization 
procedure and a second-stage optimization that perturbs well locations within the 
patterns. The core optimizer used is PSO, but the method could be implemented 
with other derivative-free optimization algorithms including GA. 

2.4.1.1 Well Pattern Description 

The basic PSO procedure was described in Section 2.2.3.2. In the well pattern description
(WPD), the optimization parameters define the target pattern. This pattern
is then replicated over the entire domain, with wells that fall outside of the reservoir
eliminated. The algorithm considers four different well pattern types, as shown
in Figure 2.6. Optimization variables include the pattern type (categorical variable
$I^{wp}$), the location of one of the wells in the pattern $(\xi^0, \eta^0)$, pattern dimensions
$(a, b)$, and parameters associated with a number of pattern operators, which we
now describe.

Fig. 2.6 Illustration of the well patterns considered: (a) inverted five-spot, (b) inverted six-spot,
(c) inverted seven-spot, (d) inverted nine-spot. The solid black circles represent production
wells and the circles with arrows represent injection wells. The patterns are referred to
as 'inverted' because the injection wells are at the centers of the patterns (from [21]).

The patterns determined using the representation above will be quite regular and
oriented with the $x$-$y$ coordinate system. It may be advantageous, however, to
adjust the orientation of the pattern to better accommodate the reservoir shape or the
spatial variation/correlation of rock properties such as permeability. To accomplish
this, several different pattern operators were introduced in [21]. These include a
rotation operator, a shear operator and a scale operator. Well locations for the target
pattern, after application of these operators, can be expressed as:

$$W_{out}^T = M\, W_{in}^T, \tag{2.6}$$

where $W_{out}$ and $W_{in}$ are $N_{wp} \times 2$ (relative) well location matrices, where $N_{wp}$ is the
number of wells in the pattern, and $M$ is a $2 \times 2$ transformation matrix, defined for
each operator. For example, for the rotation operator, we have:

$$M_\theta = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}, \tag{2.7}$$

where $\theta$ designates the angle of rotation. $M$ matrices are also defined for shear
and scale operators; see [21] for details. A fourth operator, referred to as 'switch,'
which acts to convert all injection wells to production wells and vice versa, was also
introduced. This operator changes the target pattern from the so-called 'normal'
form to the 'inverted' form (or back).
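
The action of the rotation operator on a pattern can be sketched as follows. The example applies the rotation matrix of Eq. (2.7) to each relative well location in turn; the function names and the unit-sized five-spot used as input are assumptions made for illustration, and the shear and scale matrices of [21] are not reproduced here.

```python
import math

def rotation_matrix(theta):
    """2 x 2 rotation matrix with the sign convention of Eq. (2.7)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def apply_operator(m, wells):
    """Apply a 2 x 2 operator m to each relative well location (xi, eta)."""
    return [(m[0][0] * xi + m[0][1] * eta,
             m[1][0] * xi + m[1][1] * eta) for xi, eta in wells]

# Hypothetical inverted five-spot in relative coordinates:
# injector at the center, producers at the four corners.
five_spot = [(0.0, 0.0), (-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
rotated = apply_operator(rotation_matrix(math.radians(45.0)), five_spot)
```

With this sign convention a positive angle turns the pattern clockwise. The central injector stays in place and the distance of every producer from the center is preserved.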

The full set of optimization variables for the well pattern description, for PSO
particle $i$, is given by (with the iteration index $k$ omitted for clarity):

$$x_i = \big[\, \underbrace{\{I_i^{wp}, [\xi_i^0, \eta_i^0, a_i, b_i]\}}_{\text{pattern parameters}} \;\; \underbrace{\{S_{i,1}, S_{i,2}, \ldots, S_{i,N_o}\}}_{\text{operator sequence}} \;\; \underbrace{\{O_{i,1}, O_{i,2}, \ldots, O_{i,N_o}\}}_{\text{pattern operators}} \,\big]. \tag{2.8}$$

Here $\{I_i^{wp}, [\xi_i^0, \eta_i^0, a_i, b_i]\}$ are the basic pattern parameters for particle $i$, $N_o$ is the
number of pattern operators, $\{O_{i,1}, O_{i,2}, \ldots, O_{i,N_o}\}$ are the parameters associated
with the pattern operators, and $\{S_{i,1}, S_{i,2}, \ldots, S_{i,N_o}\}$ defines the sequence in which
the operators are applied. The total number of optimization variables depends on
the number and type of operators included, but it is only around 25 when all of the
operators noted above are used. All components of $x_i$ are treated as real numbers in
the optimization. Some of these parameters (e.g., $I_i^{wp}$ and $S_{i,j}$) are, however, integers.
Where necessary, integer values are determined from real values by simply rounding
to the nearest integer.
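
The rounding step can be illustrated with a small decoder for the categorical pattern type. The zero-based encoding of the four pattern types, the clipping of out-of-range values, and the rounding of exact halves upward are all assumptions made for this sketch; the text only states that real values are rounded to the nearest integer.

```python
import math

PATTERN_TYPES = ["inverted five-spot", "inverted six-spot",
                 "inverted seven-spot", "inverted nine-spot"]

def to_index(value, n_choices):
    """Round a real-valued variable to the nearest integer in [0, n_choices - 1]."""
    nearest = int(math.floor(value + 0.5))   # halves round upward
    return min(max(nearest, 0), n_choices - 1)

def decode_pattern_type(value):
    """Map the real-valued PSO component to one of the pattern types."""
    return PATTERN_TYPES[to_index(value, len(PATTERN_TYPES))]
```

Treating integer variables this way lets the PSO update operate on a purely real-valued vector, at the price of a piecewise-constant dependence of the objective on those components.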

2.4.1.2 Second-Stage Optimization 

Following the determination of the optimum repeated pattern using the well pattern
description (WPD) approach described above, a second-stage optimization can be
applied to further improve the solution. This procedure is based on a well-by-well
perturbation (WWP) and involves the local shifting of wells within patterns. Optimization
variables (PSO particles) for WWP optimization are:

$$x_i = \{\underbrace{\Delta\xi_1, \Delta\eta_1}_{\text{well 1}},\ \underbrace{\Delta\xi_2, \Delta\eta_2}_{\text{well 2}},\ \ldots,\ \underbrace{\Delta\xi_j, \Delta\eta_j}_{\text{well } j},\ \ldots,\ \underbrace{\Delta\xi_{N_w}, \Delta\eta_{N_w}}_{\text{well } N_w}\}, \tag{2.9}$$

where $N_w$ is the number of wells determined in the first-stage (WPD) optimization
and $\Delta\xi_j$ and $\Delta\eta_j$ are the perturbations of the spatial locations of well $j$. The
minimum and maximum values of $\Delta\xi_j$ and $\Delta\eta_j$ are constrained to keep wells essentially
within their original patterns. The dimension of this optimization problem
can be high for large $N_w$, but the size of the search space is greatly limited by bound
constraints on $\Delta\xi_j$ and $\Delta\eta_j$. We note finally that this second-stage optimization
could be extended to determine completion intervals (i.e., vertical locations where
the well is open to flow), to eliminate particular wells, or to modify individual well
types.
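
A minimal sketch of how a WWP particle could be applied to the first-stage well locations is given below. It assumes a single symmetric bound `max_shift` on every perturbation and enforces it by clipping; the function name, the flat ordering of the perturbations, and the sample coordinates are hypothetical.

```python
def apply_wwp(wells, perturbations, max_shift):
    """Shift each well by its (d_xi, d_eta) pair, clipped to +/- max_shift.

    wells: list of (xi, eta) locations from the first-stage optimization.
    perturbations: flat list [d_xi_1, d_eta_1, d_xi_2, d_eta_2, ...].
    """
    if len(perturbations) != 2 * len(wells):
        raise ValueError("expected two perturbations per well")
    moved = []
    for j, (xi, eta) in enumerate(wells):
        d_xi = min(max(perturbations[2 * j], -max_shift), max_shift)
        d_eta = min(max(perturbations[2 * j + 1], -max_shift), max_shift)
        moved.append((xi + d_xi, eta + d_eta))
    return moved

# Two hypothetical wells; the third perturbation exceeds the bound and is clipped.
shifted = apply_wwp([(10.0, 20.0), (30.0, 40.0)], [1.0, -2.0, 9.0, 0.5], 3.0)
```

Keeping the bound small relative to the pattern dimensions is what confines each well to its original pattern and keeps the search space compact despite the high dimension.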

2.4.2 Field Development Optimization Example



We now apply the procedures described above to a two-dimensional reservoir
model. This example is taken from [21]; refer to that paper for full details. The
reservoir domain is irregular, as shown in Figure 2.7, where the dark regions along
the boundaries designate non-reservoir zones. Wells that fall outside of the reservoir
region are eliminated from the set. The model contains a total of 80 x 132
grid blocks. The production and injection wells are prescribed to operate at fixed
bottom-hole pressures of 1200 psi and 2900 psi, respectively. The total production
time is 1825 days. Flow simulations for this case were performed using the streamline
simulator 3DSL [38]. Streamline simulators are not as broadly applicable as
standard finite-volume based simulators, but when appropriate, as they are in many
waterflood simulations, streamline approaches can be considerably more efficient
than standard procedures.

The well pattern optimization runs used 40 PSO particles and proceeded for
40 iterations. The optimization was run five times. Following these five runs,
the best optimization solution (run 3 in Table 2.2) was used for five subsequent
WWP optimizations. Results for NPV for the well pattern optimizations are shown
in Table 2.2, while those from the subsequent use of WWP are presented in
Table 2.3. It is evident from Table 2.2 that the inverted five-spot was the best pattern
in all runs. We see from Table 2.3 that WWP consistently led to improvements of
around 20% over the unperturbed patterns. The progress of the overall optimization
is displayed in Figure 2.8, where the improvement in NPV during both stages is
evident.




Fig. 2.7 Logarithm of permeability field for field development optimization example (from
[21]).

Table 2.2 Optimization results using well pattern description (from [21])

                                       Well count
Run       Best pattern   NPV ($MM)   Producers   Injectors
1         inv. 5-spot    1377        16          15
2         inv. 5-spot    1459        15          15
3         inv. 5-spot    1460        15          15
4         inv. 5-spot    1372        15          15
5         inv. 5-spot    1342        13          15
Average                  1402







Table 2.3 Optimization results using the second-stage procedure relative to run 3 (from [21])

                      Increase over well pattern description
Run       NPV ($MM)   ($MM)     %
1         1777        317       21.7
2         1787        327       22.4
3         1776        316       21.6
4         1801        341       23.4
5         1771        311       21.3
Average   1782        322

Fig. 2.8 NPV of best result from well pattern description (WPD), and average NPV of the
best second-stage well-by-well perturbation (WWP) solutions, versus number of simulations
(from [21]).



Figures 2.9(a) and (b) show the optimal well locations from both stages of the
optimization. Repeated five-spot patterns are evident in both figures. It is interesting
to observe that, although the differences in well locations between the two figures
are relatively slight, these perturbations result in an improvement in NPV of 23%.

Fig. 2.9 Well locations for (a) the best well pattern description (WPD) and (b) the best
well-by-well perturbation (WWP) solutions (circles indicate production wells, crosses indicate
injection wells). Logarithm of permeability field is shown as background (from [21]).



We note finally that several other examples demonstrating the use of PSO for
well placement optimization were presented in [20, 21]. In the examples in [20],
the number of wells was always specified, though in some cases the well type
was also optimized (e.g., deviated and branched wells were considered in some
cases). Comparisons to optimizations using a genetic algorithm (GA) were presented
and, as noted above, PSO was shown to consistently outperform the GA
considered. In one of the examples in [21], the well pattern optimization followed by
well-by-well perturbation was compared to an unconstrained optimization that used
$3N_w$ decision variables (the latter is referred to as the 'concatenation' approach).
For this case the two-stage optimization consistently outperformed the concatenation
approach. Taken in total, the results in [20, 21] display the applicability of PSO
for well placement optimization problems, as well as the potential advantages of the
well pattern description and the two-stage optimization procedure.

It will clearly be useful to combine the well control optimization described in
Section 2.3 with the field development optimization considered here. This coupled
optimization problem will be computationally demanding, but the solutions pro- 
vided can be expected to outperform those determined through the sequential appli- 
cation of the two procedures. Work along these lines is currently underway. 

2.5 Assimilation of Reservoir Data (Inverse Modeling) 

The reliability of oil production forecasts, and the 'optimal' strategy that is deter- 
mined based on these predictions, depend strongly on the proper calibration of the 
reservoir simulation model. In essence, this calibration aims at finding appropriate 
model parameters given a number of observations. The two model parameters that 
(in many cases) most directly impact reservoir flow are permeability and porosity. 
Both of these parameters vary spatially. For a given rock type, which is denoted as 
facies in this context, porosity and permeability are often correlated, and one can be 
estimated from the other. In this work, the calibration parameter is taken to be the 
facies in each grid block, and we assume that each facies corresponds to a particular 
permeability and porosity. 

Historic flow production represents one set of observed data. Such data are cru- 
cial because it is precisely the prediction of the reservoir flow response that is the 
ultimate purpose of the modeling. However, production data provide direct information
only at well locations (though of course the flow rates and pressures observed
at wells are impacted by reservoir properties outside the well region). In contrast to 
production data, seismic measurements (such as diffraction tomography) provide 
more global information and thus can be used to improve estimates of the spatial 
distribution of rock properties. Here we consider as observable data both flow and 
seismic measurements. 

The use of observational data to infer reservoir properties is an inverse problem. 
As such, we anticipate that the solution will be non-unique. This is typically the 
case because there are more parameters to estimate than there are independent mea- 
surements, so many combinations of parameters yield similar model responses. In 
addition to the underspecified nature of the problem, additional complications arise 
from the approximations used in the forward modeling and from the presence of 
noise in the data. Uncertainty quantification/assessment involves finding multiple 
solutions of the inverse problem in order to generate a collection of production fore- 
casts. For more information on data assimilation under uncertainty in this context, 
refer to, e.g., [73, 45, 74].

2.5.1 Problem Statement

The solutions of a geophysical inverse problem are the set of geological models that, 
when forward-modeled to provide simulation data, match the observations to within 
some tolerance. Since the approach here, as shown below, involves formulating the 
data assimilation process in optimization terms, any model configuration (set of
inversion parameters) will be denoted by $x \in \Omega \subset \mathbb{R}^n$, where $\Omega$ is the set of admissible
models. The admissibility criteria can be formulated with respect to geological con- 
sistency. Geological consistency typically implies a particular spatial correlation of 
parameters (e.g., a given spatial covariance). The model x in this work represents 
the facies type associated with every grid block. Thus, the number of optimization 
variables n is on the order of the number of grid blocks in the discretized reservoir 
model (which can be very large in practical models). 

From an optimization perspective, the inverse problem can be stated as follows:

$$\min_{x \in \Omega} \| O(x) - y \|_2 \tag{2.10}$$

where $y \in \mathbb{R}^m$ are the observations and $O(x) \in \mathbb{R}^m$ represents the numerically
simulated observations. All the observable data considered are concatenated in $y$
and $O(x)$. Thus, if $O_1(x) \in \mathbb{R}^{m_1}$ and $O_2(x) \in \mathbb{R}^{m_2}$ are the two sets of observable
data considered, then $O(x) = [O_1(x), O_2(x)]$, with $m_1 + m_2 = m$. In the norm
(Euclidean in this work), we can account for data uncertainty and include weights for
the different sets of data. Since the observable data in this work are normalized,
weights are taken to be unity. We reiterate that there are typically a much larger
number of inversion parameters than there are independent measurements ($n \gg m$),
and therefore the optimization problem in (2.10) is frequently ill-conditioned.
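
As a minimal sketch of how the cost in (2.10) might be assembled, the Python fragment below concatenates the residuals of two sets of normalized observable data and evaluates the Euclidean misfit. The function name `misfit`, the optional weights and the toy numbers are illustrative assumptions and are not taken from the original formulation.

```python
import math

def misfit(sim_flow, sim_seis, obs_flow, obs_seis, w_flow=1.0, w_seis=1.0):
    """Euclidean misfit ||O(x) - y||_2 for two concatenated data sets.

    The weights default to unity, as in the text for normalized data.
    """
    if len(sim_flow) != len(obs_flow) or len(sim_seis) != len(obs_seis):
        raise ValueError("simulated and observed data must have equal lengths")
    # Concatenate the weighted residuals of both data types: O(x) - y.
    residual = [w_flow * (s - o) for s, o in zip(sim_flow, obs_flow)]
    residual += [w_seis * (s - o) for s, o in zip(sim_seis, obs_seis)]
    return math.sqrt(sum(r * r for r in residual))

# Toy example with m1 = 2 flow values and m2 = 2 seismic values (m = 4).
value = misfit([1.0, 2.0], [0.5, 0.5], [1.0, 0.0], [0.5, 2.0])
```

In an actual inversion the simulated values would come from the flow and seismic simulators evaluated for a given model; here they are plain lists so that the structure of the cost stays visible.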

2.5.2 Methodologies for Data Assimilation 

The optimization problem in (2.10) presents a number of challenges in addition to
ill-conditioning. The cost function requires costly simulations, and in many cases 
derivative information is expensive to obtain or not available. The number of opti- 
mization variables is often large and the objective function can be non-smooth due 
to, for example, the presence of noise in the observations. These difficulties can be 
addressed by means of the following strategies. 

The integration of disparate data in reservoir modeling has been suggested in a 
number of publications (e.g., [75, 76, 77]) as a means to alleviate the ill-conditioned
character of (2.10). In essence, the use of different data types provides a degree
of regularization for the inverse problem. Here, as in [32], we use as observable
data oil and water production rates and diffraction tomography data. These data sets 
are complementary since they measure system responses on different spatial and 
temporal scales. 

We can also expect a better conditioned optimization problem if the number of 
parameters is decreased. Instead of searching in $n$ dimensions, we consider a
subspace of dimension $n_R$. This subspace selection is not arbitrary and essentially aims
at reducing the correlation between inversion parameters. The parameter reduction
used here is based on principal component analysis (PCA), or the Karhunen-Loève
transform, and can also be interpreted from a data compression perspective. The
statistical information needed is generally obtained from a prior (rough) knowledge
of the reservoir properties, and provides the inversion with geological consistency.
We essentially follow the PCA-based parameter reduction technique used in [6],
though that approach considers only flow production data and is invasive with re- 
spect to the flow simulator (thus it is very efficient but requires source-code access 
to implement). 

In the example in Section 2.5.3, both production and seismic measurements
provide the observable data. We reduce the number of optimization variables and
introduce geological consistency through principal component analysis, and we
approach (2.10) by derivative-free local optimization with an initial guess selected by
a heuristic procedure that is based on information obtained by PCA. Numerical
derivatives and a global procedure (GA) are also considered for comparison.
All of these black-box approaches are more demanding computationally
than an invasive adjoint-based gradient procedure but, as mentioned above, can 
be significantly accelerated through distributed computing. We briefly present be- 
low the fundamentals of PCA, since that transformation is a key component of our 
methodology. 

2.5.2.1 Parameter Reduction Using Principal Component Analysis 

Principal component analysis (PCA) optimally selects a subspace of dimension $n_R$
from a larger space of dimension $n$. Given $N$ possible models
$\{x_k\}_{k=1}^{N} \subset \Omega \subset \mathbb{R}^n$ sampled from $\Omega$,
the region of the search space where plausible optimal solutions are expected,
PCA seeks an affine transformation



with $\mu \in \mathbb{R}^n$ and the set $\{s_i\}_{i=1}^{n_R} \subset \mathbb{R}^n$ orthonormal. We note that this
transformation is essentially an orthogonal projection. PCA is optimal in the sense that the
Euclidean reconstruction error $\|x_k - \hat{x}_k\|_2$ (with $\hat{x}_k$ the PCA reconstruction of $x_k$),
averaged over $\{x_k\}_{k=1}^{N}$, is minimized (or,
equivalently, that the average reconstruction energy is maximized).
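
For reference, the transformation sought by PCA can be written in the standard form below. This is a sketch under the assumption that the usual PCA projection is intended. The hat notation for the reconstructed model and the coefficient symbol $c_i$ are introduced here for illustration, and the optimality criterion is stated in its common squared-error form.

```latex
\hat{x} = \mu + \sum_{i=1}^{n_R} c_i \, s_i ,
\qquad c_i = s_i^{T} (x - \mu) ,
\qquad
\min_{\mu,\,\{s_i\}} \; \frac{1}{N} \sum_{k=1}^{N} \left\| x_k - \hat{x}_k \right\|_2^2 .
```

The coefficients $c_i$ are the $n_R$ reduced optimization variables, so a search over $c \in \mathbb{R}^{n_R}$ replaces the search over $x \in \mathbb{R}^n$.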

The optimal solution [78] implies that $\mu$ is the average of the $N$ sampled
models $\{x_k\}_{k=1}^{N}$, and that each $s_i$ is an eigenvector of the covariance matrix associated
with these models. Additionally, it can be seen that the covariance matrix for the $n_R$
PCA coefficients for $\{x_k\}_{k=1}^{N}$ is a diagonal matrix, and that the contribution to the
average reconstruction error from each of the PCA basis components $s_i$ is equal to
the corresponding eigenvalue.
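
The properties just listed can be checked numerically. The sketch below, which assumes NumPy and uses synthetic random models in place of geostatistical realizations, builds a PCA basis from the sample covariance and computes the covariance of the resulting coefficients, which should be diagonal with the eigenvalues on its diagonal. The helper names `pca_basis`, `project` and `reconstruct` are hypothetical.

```python
import numpy as np

def pca_basis(models, n_r):
    """Return the mean, the n_r leading eigenvectors and their eigenvalues.

    models: array of shape (N, n), one sampled model per row.
    """
    mu = models.mean(axis=0)
    centered = models - mu
    # Sample covariance of the models (n x n); eigh handles symmetric matrices.
    cov = centered.T @ centered / (models.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_r]      # largest eigenvalues first
    return mu, eigvecs[:, order], eigvals[order]

def project(x, mu, basis):
    """PCA coefficients of model x (orthogonal projection onto the basis)."""
    return basis.T @ (x - mu)

def reconstruct(coeffs, mu, basis):
    """Map PCA coefficients back to the full n-dimensional model."""
    return mu + basis @ coeffs

rng = np.random.default_rng(0)
# Synthetic stand-in for N = 200 prior realizations with n = 12 unknowns;
# multiplying by a fixed matrix induces correlation between the unknowns.
models = rng.normal(size=(200, 12)) @ rng.normal(size=(12, 12))
mu, basis, eigvals = pca_basis(models, n_r=4)

coeffs = (models - mu) @ basis                   # (N, n_r) PCA coefficients
coeff_cov = np.cov(coeffs, rowvar=False)         # expected to be diagonal
```

For models with thousands of grid blocks, forming the full covariance matrix is wasteful; a singular value decomposition of the centered sample matrix gives the same basis and would be the more practical choice.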

The selection of the $N$ models $\{x_k\}_{k=1}^{N}$ is crucial and is done based on prior
information. If these models provide an acceptable representation of $\Omega$, a large part
of the $n_R$-dimensional search space will provide solutions that are (in this case,
geologically) consistent. Therefore, PCA not only reduces the search space, but also
helps to ensure that the solutions obtained are practically acceptable. The value $n_R$
is typically much smaller than $n$. Low values of $n_R$ yield low-dimensional search
spaces that are easier to explore, but the reconstruction error can be unacceptably
large. In other words, the optimal search would take place only in a small part of $\Omega$,
and thus the solutions obtained in that reduced space may be clearly suboptimal. The
determination of the appropriate value for $n_R$ is application specific and is typically
done through numerical experimentation.
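
One possible numerical experiment, not necessarily the one used in this work, is to look at how much of the total variance is discarded for each candidate subspace dimension. The sketch below assumes NumPy, and the function name `smallest_n_r` and the example spectrum are hypothetical.

```python
import numpy as np

def smallest_n_r(eigvals, max_rel_error):
    """Smallest n_R whose discarded eigenvalues stay below a relative budget.

    eigvals: covariance eigenvalues sorted in decreasing order.
    max_rel_error: allowed fraction of the total variance that may be lost.
    """
    cum = np.cumsum(np.asarray(eigvals, dtype=float))
    total = cum[-1]
    # lost[j] = fraction of the variance discarded when keeping j components.
    lost = 1.0 - np.concatenate(([0.0], cum)) / total
    admissible = np.nonzero(lost <= max_rel_error)[0]
    return int(admissible[0])

# Hypothetical, rapidly decaying spectrum.
spectrum = [8.0, 4.0, 2.0, 1.0, 0.5, 0.25, 0.125, 0.125]
n_r = smallest_n_r(spectrum, max_rel_error=0.10)
```

A variance budget is only a proxy. Whether the reconstructions are acceptable for the application still has to be judged on the models themselves.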

A ranking for the PCA components can be established based on their respective 
eigenvalues - the higher the eigenvalue, the higher the rank (and thus the impor- 
tance) of the associated PCA basis vector. This, together with the fact that the co- 
variance matrix for the PCA coefficients is diagonal, suggests that a sequence of 
one-dimensional optimizations aimed at computing coefficients for the highest-rank 
PCA basis vectors may be beneficial in the overall optimization. Based on this
observation, a heuristic PCA-based procedure for computing the initial guess in (2.10)
can be obtained (see [32] for details).
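
The actual heuristic is described in the cited reference and is not reproduced here. The following is only one plausible reading of the idea, given as a sketch: starting from the mean model, the leading PCA coefficients are optimized one at a time with a one-dimensional golden-section search. The function names, the search bound and the toy separable cost are all assumptions.

```python
import math

def golden_section(f, lo, hi, tol=1e-6):
    """Minimize a unimodal one-dimensional function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while abs(b - a) > tol:
        if fc < fd:                       # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                             # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def initial_guess(cost, n_r, n_lead, bound=3.0):
    """Optimize the n_lead highest-rank PCA coefficients one at a time.

    cost: function of the full coefficient list (length n_r).
    Coefficients are assumed to be ordered by decreasing eigenvalue.
    """
    coeffs = [0.0] * n_r                  # all zeros corresponds to the mean model
    for i in range(n_lead):
        def along_axis(t, i=i):
            trial = list(coeffs)
            trial[i] = t
            return cost(trial)
        coeffs[i] = golden_section(along_axis, -bound, bound)
    return coeffs

# Toy separable cost standing in for the simulation-based misfit.
target = [1.5, -0.5, 0.25, 2.0]
cost = lambda c: sum((ci - ti) ** 2 for ci, ti in zip(c, target))
guess = initial_guess(cost, n_r=4, n_lead=2)
```

Because the PCA coefficients are uncorrelated over the prior sample, treating them one at a time is a reasonable first pass, although the misfit itself need not be separable in these coordinates.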

2.5.3 Data Assimilation Example 

The case study is taken from [32] and is based on a ten-layer synthetic model (with
20 × 20 × 10 = 4000 cells) extracted from the Stanford VI reservoir model [79].
This approach provides a good framework for comparing inversion methodologies
since the true model is known. We simulate a five-spot well pattern (four injectors
in the corners, and one producer in the center of the domain; see Figure 2.10(a)).
The optimization variable x is a binary facies indicator in every grid block (desig- 
nating the block as either sand or shale). Though this variable is binary valued, it 
can be relaxed to a continuous variable. Thus, a value of 0.5 indicates that in the 
corresponding grid block, sand and shale are distributed equally. 

The observable production data consist of the total field cumulative oil production
and water injection, obtained at intervals of ten days up to 90 days (therefore,
$m_1 = 10 + 10 = 20$). The production data are computed by solving the (discretized)
reservoir flow equations. Here we use Stanford's general purpose research simulator
(GPRS; [36, 37]). The permeability and porosity fields are functions of the facies
parameter $x$. Given a (real-valued) facies parameter for grid block $i$, designated $x_i$,
we compute the associated porosity $\phi_i$ by the following expression

$$\phi_i(x_i) = \phi_0 \exp\left( x_i \ln(\phi_1/\phi_0) \right),$$

where the coefficients $\phi_0$ and $\phi_1$ are the porosity values associated with the shale
and sand facies, respectively (in practice, these values can be determined through
measurements on rock cores or regression). We relate the block permeability $k_i$ to the
porosity $\phi_i$ using the Kozeny-Carman equation (see, e.g., [80])

with the parameter $a$ calculated from measurements or regression.

Fig. 2.10 Layer 4 from (a) the true model studied in Section 2.5.3 (injection and production
wells are indicated as blue and red circles, respectively), (b) corresponding reconstruction
after PCA with $N = 1000$ realizations and $n_R = 30$, (c) model selected randomly from the set
of $N = 1000$ realizations, and (d) corresponding reconstruction after PCA with $n_R = 30$. Red
and blue represent sand and shale facies, respectively. The original facies model is
binary-valued, but after PCA it becomes continuous (from [32]); see online version for colors.
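
The facies-to-property mapping described above can be sketched as follows. The porosity expression is the one given in the text. The Kozeny-Carman expression is written in a commonly used form, k = a φ³ / (1 − φ)², which is an assumption here, and all numerical values are hypothetical placeholders.

```python
import math

PHI_SHALE = 0.05   # hypothetical porosity of the shale facies (phi_0)
PHI_SAND = 0.30    # hypothetical porosity of the sand facies (phi_1)

def porosity(x, phi0=PHI_SHALE, phi1=PHI_SAND):
    """phi(x) = phi0 * exp(x * ln(phi1 / phi0)) for a relaxed facies value x."""
    return phi0 * math.exp(x * math.log(phi1 / phi0))

def permeability(phi, a=1.0e4):
    """Assumed Kozeny-Carman form k = a * phi**3 / (1 - phi)**2.

    Both the exact expression and the value of a are placeholders.
    """
    return a * phi ** 3 / (1.0 - phi) ** 2

# A relaxed facies value of 0.5 gives the geometric mean of the two porosities.
phi_mixed = porosity(0.5)
```

With this mapping, x = 0 returns the shale porosity and x = 1 the sand porosity, and intermediate values interpolate geometrically between the two.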



The second set of observable data is derived from crosswell diffraction tomography.
In crosswell tomography, sound wave sources are placed in one (usually vertical)
well and receivers are placed in another well (typically some hundred meters
away). By recording the waves propagating from one well to another, it is possible
to reconstruct approximately the structure of the earth in between the wells. The
estimated earth image is sometimes called a crosswell section. In this example we
have two crosswell sections obtained by associating diagonally the injectors in the
five-spot pattern in Figure 2.10(a). Each section involves the ten layers in the model
and is discretized by a 20 × 20 matrix of velocities (hence, $m_2 = 400 + 400 = 800$).
The tomographic data along these two perpendicular crosswell sections are computed
only once, after 90 days. The seismic observable data depend on certain rock
properties (elastic bulk modulus and density) which in turn are functions of the fluid
saturations at each grid block [80]. The input for the seismic tomography simulator
thus includes the model $x$, which provides the porosity for each grid block, and
fluid saturations. These quantities, together with rock physics models, are used to
compute the elastic velocities [80]. In all tomography calculations, both for the
observations and during optimization, a simplified geometry for the top of the reservoir
is considered, and the associated corrections are not included.

A priori knowledge of the reservoir geology, in the form of a so-called training
image [81], together with facies data obtained at the well locations, allows the generation
of $N = 1000$ geologically consistent model realizations, all conditioned to the
prior information. These models are generated using a multipoint geostatistical
algorithm [81], which can represent complex spatial structures. Through application
of PCA to these 1000 realizations we reduce the number of inversion parameters
from $n = 4000$ to $n_R = 30$. In Figure 2.10 we show two of these models (one of the
ten model layers is shown) and their corresponding reconstructions. For our application,
these PCA reconstructions are acceptable.
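
Collecting the sizes quoted in this example gives a quick check of why the parameter reduction helps. The numbers below are taken from the text; only the arrangement is added.

```latex
m = m_1 + m_2 = 20 + 800 = 820, \qquad n = 4000 > m, \qquad n_R = 30 < m .
```

In the full space there are more unknowns than data values, whereas in the reduced space the number of unknowns is well below the number of data values.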

2.5.3.1 Inversion Results and Prediction 

We compare here sequential quadratic programming (SQP) using numerical gradients
with generalized pattern search (GPS), Hooke-Jeeves direct search (HJDS), and
a genetic algorithm (GA). The initial guess for the local optimizers is computed as
outlined above (see [32] for details). The GA population is 60 individuals and the
algorithm is run for 100 generations. The initial population in the GA does not contain
the initial guess taken for the local optimizers. In this way, we can test if GA can
be beneficial in cases when useful initial guesses are not available. The distributed
computing environment consists of a cluster with 48 nodes, and it is used for the
SQP, GPS and GA optimizations. Each observable data value is assigned random
noise with an amplitude of 5% of the standard deviation of the corresponding data
type.
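
The noise model can be illustrated with a short sketch. The text specifies only the amplitude (5% of the standard deviation of each data type), so the Gaussian distribution, the function name `add_noise` and the toy data below are assumptions.

```python
import random
import statistics

def add_noise(values, level=0.05, rng=None):
    """Perturb each value with noise scaled by the spread of its data type.

    Gaussian noise is an assumption; only the 5% amplitude comes from the text.
    """
    rng = rng or random.Random(0)
    sigma = level * statistics.pstdev(values)
    return [v + rng.gauss(0.0, sigma) for v in values]

flow_obs = [10.0, 20.0, 30.0, 40.0]          # toy cumulative production data
seis_obs = [2500.0, 2510.0, 2490.0, 2500.0]  # toy velocities
# Each data type is perturbed separately, using its own standard deviation.
noisy_flow = add_noise(flow_obs)
noisy_seis = add_noise(seis_obs)
```

Scaling by the spread of each data type keeps the relative noise level comparable between flow and seismic data, whose raw magnitudes differ considerably.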

The models determined through inversion are shown in Figure 2.11. These results
are for the same layer as shown in Figure 2.10, though they are generally representative
for all ten layers in the model. As noted earlier, after PCA the original binary
facies model is continuous (it could be transformed back to binary values using 
thresholding if necessary). It is evident that all of the methods provide reasonable 
models. A carefully selected initial guess is crucial for obtaining acceptable inver- 
sion results with the SQP, GPS and HJDS methods. Our process for determining the 
initial guess relies on some heuristics and therefore is not fully general, though it 
appears adequate for this case. The GA result appears slightly less accurate than the 
others, though the main model features are captured.

Fig. 2.11 Inverse model results for layer 4 of the reservoir section studied in Section 2.5.3.
Facies distribution obtained by (a) sequential quadratic programming, (b) generalized pattern
search, (c) Hooke-Jeeves direct search, and (d) a genetic algorithm. The genetic algorithm,
because of its global nature, does not require an initial guess. The true distribution for layer 4
is shown in Figure 2.10(a). Red and blue represent sand and shale facies, respectively. Though
the original facies model is binary-valued, after PCA it becomes continuous (from [32]); see
online version for colors.
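
The thresholding mentioned above, which maps a continuous facies model back to binary values, might look as follows. The cut-off of 0.5 and the convention that 1 denotes sand and 0 denotes shale are assumptions made for this sketch.

```python
def to_binary_facies(model, threshold=0.5):
    """Map relaxed facies values to 0 (shale) or 1 (sand) with a cut-off."""
    return [1 if value >= threshold else 0 for value in model]

# Toy relaxed facies values for four grid blocks.
relaxed = [0.05, 0.48, 0.50, 0.93]
binary = to_binary_facies(relaxed)
```

A fixed cut-off does not preserve the sand fraction of the model, so a threshold chosen to honor a target proportion may be preferable in some settings.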



Figure 2.12 illustrates the performance of the local optimizers used for this problem.
In this plot the horizontal axis is the number of equivalent simulations, which
is defined as the total number of simulations divided by the speedup obtained by
the parallel implementation. The concept of equivalent simulation is used to enable
comparisons, in terms of elapsed time (not total computation time), between
HJDS and the other (parallel) procedures. Since HJDS is inherently serial, for that
algorithm the number of equivalent simulations coincides with the total number of
simulations. Note that one simulation involves calls to both the flow and seismic
tomography simulators and that the initial guess computation for the local optimizers
requires roughly five equivalent simulations. It is evident from Figure 2.12 that SQP
provides the most efficient performance for this case. However, we expect that SQP
performance would degrade if the cost function were less smooth. If the comparison
were made in terms of total computation time, HJDS would be the most efficient
algorithm for this problem (HJDS would thus be the method of choice in the absence
of distributed computing resources).

Fig. 2.12 Performance results for the local optimizers studied in the model inversion in
Section 2.5.3 (from [32]); see online version for colors.
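
The notion of equivalent simulations reduces to a one-line computation, sketched below. The simulation counts and the speedup value are hypothetical and are not results from this study.

```python
def equivalent_simulations(total_simulations, speedup=1.0):
    """Total simulations divided by the parallel speedup (1.0 if serial)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return total_simulations / speedup

# Hypothetical counts: a parallel method with an assumed speedup of 40 on
# the 48-node cluster versus the inherently serial HJDS.
parallel = equivalent_simulations(2000, speedup=40.0)
serial = equivalent_simulations(120)
```

In this made-up case the parallel method finishes sooner in elapsed time (50 versus 120 equivalent simulations) even though it consumes far more total computation, which is the distinction drawn in the text.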

The best individual in the initial GA population had a cost function of 0.036. 
After around 200 equivalent function evaluations, the objective function for GA de- 
creased to about 0.006, though more gradually than for the other methods shown in 
Figure 2.12. This performance is promising since the GA was run without providing
any initial guess as input. If a larger population is used, GA can explore the global 
search space and, as a consequence, potentially identify multiple solutions that are 
comparable in terms of the cost function. These solutions could then be used for 
uncertainty assessment. 

The oil production and water injection forecasts over 360 days, for the model
obtained using SQP, are shown in Figure 2.13(a) (we note that the inversion involved
data over only the first 90 days). Agreement is generally very close, though slight
mismatches are evident at later times, and these mismatches grow with time. In order
to achieve accuracy over long simulation periods (up to 2000 days), the solution
determined by SQP was adjusted as follows. A new data assimilation was performed
after the first 1000 days. The observable data considered were the cumulative production
of oil and water, together with two new crosswell tomographies at the end
of the interval. The calibration at 1000 days started with the previously determined
model (as shown in Figure 2.11) and it involves only one additional parameter ($\lambda$).
This parameter simply scales globally the facies distribution; i.e., the new model is
given by $\lambda x$, where $x$ is the (old) model obtained using data for the first 90 days.
The attendant one-dimensional optimization problem in $\lambda$ required approximately
two additional equivalent simulations.

Fig. 2.13 (a) Oil production and water injection forecast (360 days) for the solution obtained
by SQP. (b) Oil and water production forecast (2000 days) for the solution recalibrated after
1000 days. In both cases the noise in the observations has been removed (from [32]); see
online version for colors.
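
How the one-dimensional recalibration was carried out is not detailed in the text. One way to read the figure of about two equivalent simulations, offered here purely as an assumption, is that candidate values of the scaling parameter are evaluated in parallel batches, each batch costing roughly one equivalent simulation. The sketch below uses a thread pool and a toy misfit in place of the flow and seismic simulators; the search interval and batch size are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def recalibrate(cost, lo=0.5, hi=1.5, batch=48, rounds=2):
    """Batch grid search for the global scaling factor applied to the model.

    Each round evaluates `batch` candidates concurrently, so a round costs
    roughly one equivalent simulation when enough nodes are available.
    """
    with ThreadPoolExecutor(max_workers=batch) as pool:
        for _ in range(rounds):
            step = (hi - lo) / (batch - 1)
            grid = [lo + j * step for j in range(batch)]
            costs = list(pool.map(cost, grid))
            best = grid[costs.index(min(costs))]
            # Shrink the interval around the best candidate for the next round.
            lo, hi = best - step, best + step
    return best

# Toy misfit with a minimum at a scaling of 1.1, standing in for simulator runs.
lam = recalibrate(lambda s: (s - 1.1) ** 2)
```

With 48 candidates per round, two rounds narrow the scaling factor to within about one thousandth of the search interval, which is consistent in spirit with the small cost reported above.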

Figure 2.13(b) shows predictions from the new model for oil and water production
over 2000 days. The model provides accurate predictions over the entire period. 
This type of recalibration can be done in practice whenever the deviation between 
the prediction and the corresponding data is larger than some acceptable tolerance. 
Since this new calibration is performed using a solution calculated previously, the 
number of parameters considered can be relatively small. Alternative approaches, in 
which more parameters (or the entire model) are computed, could also be applied. 



2.6 Concluding Remarks 

In this chapter we have applied derivative-free optimization methods to three dif- 
ferent problems relevant to oil field operations. The examples considered are repre- 
sentative of a wide range of practical simulation-based optimization problems and 
involve oil production optimization with general operating constraints, field devel- 
opment using a well pattern description, and data assimilation based on flow and 
seismic measurements. These problems involved continuous, integer and categori- 
cal variables, and the search spaces contained at most 100 dimensions. The success- 
ful use of derivative-free methods for these problems clearly demonstrates that these 
algorithms are viable for a range of oil field applications. 

The derivative-free algorithms studied include generalized pattern search, Hooke- 
Jeeves direct search, a genetic algorithm, and particle swarm optimization. In order 
to enable additional comparisons, we also tested a gradient-based method, sequential
quadratic programming, with derivatives estimated numerically. With the exception
of Hooke-Jeeves direct search, all of these procedures can be readily parallelized
and as such benefit immensely when implemented in a distributed manner.
When parallel computing resources are limited or nonexistent, Hooke-Jeeves direct 
search represents a promising serial derivative-free optimization strategy. 

The performance of derivative-free approaches depends strongly on the dimen- 
sion of the search space, and for the computational resources typically available, 
these approaches are applicable when the number of optimization variables is on 
the order of a few hundred or less. Therefore, it may be necessary on some occasions
to combine these approaches with some type of parameter reduction strategy.
In this work, in one case we limited the size of the search space by restricting wells 
to be located within patterns, while in another case we applied principal component 
analysis to reduce the number of inversion parameters. 

There are still a number of challenges related to the problems considered in this 
chapter. Though categorical (decision) variables were included in the optimal field 
development example presented in Section 2.4, a comprehensive study on the use
and limitations of derivative-free algorithms for this type of mixed-integer nonlinear 
optimization problem would be of great interest. In addition, further comparisons 
between local and global methods, and the development of hybrid procedures, will 
also be useful. It will be beneficial to jointly address field development optimization
and well control optimization, as the optimal well locations will in general depend
on how the wells are operated. Multi-objective optimization may be of interest for 
this and other applications. 

The efficient treatment of uncertainty in all of the problems considered is also a 
topic of great importance. Data assimilation methodologies that generate multiple 
solutions consistent with observed data are required. Optimization techniques that 
can efficiently handle multiple models are also needed. Finally, because the forward 
simulations required for our optimization methods are themselves often very time- 
consuming, the development of fast and reliable surrogate models will be of great 
use. Research in many of these areas is currently underway. 

Acknowledgements. We are grateful to the industry sponsors of the Stanford Smart Fields 
Consortium and the Stanford Center for Reservoir Forecasting for partial funding of this 
work, and to the Stanford Center for Computational Earth and Environmental Science for 
providing distributed computing resources. We also thank Obiajulu J. Isebor (Stanford Uni- 
versity), Jerome E. Onwunalu (now at BP) and Eduardo T. F. Santos (now at CEFET-BA) for 
their contributions to this work. 



References 

1. Jansen, J.D., Brouwer, D.R., Naevdal, G., van Kruijsdijk, C.P.J.W.: Closed-loop
reservoir management. First Break 23, 43-48 (2005)

2. Pironneau, O.: On optimum design in fluid mechanics. J. Fluid Mech 64, 97-110 (1974)

3. Ramirez, W.F.: Application of Optimal Control Theory to Enhanced Oil Recovery. 
Elsevier, Amsterdam (1987) 

4. Brouwer, D.R., Jansen, J.D.: Dynamic optimization of waterflooding with smart wells 
using optimal control theory. SPE Journal 9(4), 391-402 (2004)

5. Zandvliet, M., Handels, M., van Essen, G., Brouwer, R., Jansen, J.D.: Adjoint-based 
well-placement optimization under production constraints. SPE Journal 13(4), 392-399 
(2008) 

6. Sarma, P., Durlofsky, L.J., Aziz, K., Chen, W.H.: Efficient real-time reservoir man- 
agement using adjoint-based optimal control and model updating. Computational Geo- 
sciences 10, 3-36 (2006) 

7. Conn, A.R., Scheinberg, K., Vicente, L.N.: Introduction to Derivative-Free Optimization.
MPS-SIAM Series on Optimization, MPS-SIAM (2009)

8. Kolda, T.G., Lewis, R.M., Torczon, V.: Optimization by direct search: new perspectives 
on some classical and modern methods. SIAM Review 45(3), 385-482 (2003)

9. Wright, M.H.: Direct Search methods: once scorned, now respectable. In: Griffiths, 
D.F., Watson, G.A. (eds.) Numerical Analysis 1995 (Proceedings of the 1995 Dundee 
Biennial Conference in Numerical Analysis). Pitman Research Notes in Mathematical 
Series, pp. 191-208. CRC Press, Boca Raton (1995) 

10. Meza, J.C., Martinez, M.L.: On the use of direct search methods for the molecular 
conformation problem. Journal of Computational Chemistry 15, 627-632 (1994) 

11. Booker, A.J., Dennis Jr., J.E., Frank, P.D., Moore, D.W., Serafini, D.B.: Optimization 
using surrogate objectives on a helicopter test example. In: Borggaard, J.T, Burns, J., 
Cliff, E., Schreck, S. (eds.) Computational Methods for Optimal Design and Control, 
pp. 49-58. Birkhauser, Basel (1998) 



52 D.E. Ciaurri, T. Mukerji, and L.J. Durlofsky 

12. Marsden, A.L., Wang, M., Dennis Jr., J.E., Moin, P.: Trailing-edge noise reduction us- 
ing derivative-free optimization and large-eddy simulation. Journal of Fluid Mechan- 
ics 572, 13-36 (2003) 

13. Duvigneau, R., Visonneau, M.: Hydrodynamic design using a derivative-free method. 
Structural and Multidisciplinary Optimization 28, 195-205 (2004) 

14. Fowler, K.R., Reese, J.P., Kees, C.E., Dennis Jr., J.E., Kelley, C.T., Miller, C.T., Audet,
C., Booker, A.J., Couture, G., Darwin, R.W., Farthing, M.W., Finkel, D.E., Gablonsky,
J.M., Gray, G., Kolda, T.G.: Comparison of derivative-free optimization methods for 
groundwater supply and hydraulic capture community problems. Advances in Water 
Resources 31(5), 743-757 (2008) 

15. Oeuvray, R., Bierlaire, M.: A new derivative-free algorithm for the medical image
registration problem. International Journal of Modelling and Simulation 27, 115-124 (2007)

16. Marsden, A.L., Feinstein, J.A., Taylor, C.A.: A computational framework for derivative- 
free optimization of cardiovascular geometries. Computational Methods in Applied 
Mechanics and Engineering 197, 1890-1905 (2008) 

17. Cullick, A.S., Heath, D., Narayanan, K., April, J., Kelly, J.: Optimizing multiple-field 
scheduling and production strategy with reduced risk. SPE paper 84239 presented at 
the 2009 SPE Annual Technical Conference and Exhibition, Denver, Colorado, October 
5-8 (2009) 

18. Velez-Langs, O.: Genetic algorithms in oil industry: an overview. Journal of Petroleum 
Science and Engineering 47, 15-22 (2005) 

19. Artus, V., Durlofsky, L.J., Onwunalu, J., Aziz, K.: Optimization of nonconventional
wells under uncertainty using statistical proxies. Computational Geosciences 10, 
389-404 (2006) 

20. Onwunalu, J., Durlofsky, L.J.: Application of a particle swarm optimization algo- 
rithm for determining optimum well location and type. Computational Geosciences 14, 
183-198 (2010) 

21. Onwunalu, J., Durlofsky, L.J.: A new well pattern optimization procedure for large- 
scale field development. SPE Journal (in press) 

22. Bittencourt, A.: Optimizing Hydrocarbon Field Development Using a Genetic Algo- 
rithm Based Approach. PhD thesis. Dept. of Petroleum Engineering, Stanford Univer- 
sity (1997) 

23. Bittencourt, A.C., Horne, R.N.: Reservoir development and design optimization. SPE
paper 38895 presented at the 1997 SPE Annual Technical Conference and Exhibition, 
San Antonio, Texas, October 5-8 (1997) 

24. Yeten, B., Durlofsky, L.J., Aziz, K.: Optimization of nonconventional well type, loca- 
tion and trajectory. SPE Journal 8(3), 200-210 (2003) 

25. Harding, T.J., Radcliffe, N.J., King, P.R.: Optimization of production strategies using
stochastic search methods. SPE paper 35518 presented at the 1996 European 3-D Reservoir
Modeling Conference, Stavanger, Norway, April 16-17 (1996)

26. Almeida, L.F., Tupac, Y.J., Lazo Lazo, J.G., Pacheco, M.A., Vellasco, M.M.B.R.: 
Evolutionary optimization of smart-wells control under technical uncertainties. SPE 
paper 107872 presented at the Latin American & Caribbean Petroleum Engineering
Conference, Buenos Aires, Argentina, April 15-18 (2007) 

27. Carroll III, J.A.: Multivariate production systems optimization. Master's thesis, Dept. 
of Petroleum Engineering, Stanford University (1990) 

28. Echeverria Ciaurri, D., Isebor, O.J., Durlofsky, L.J.: Application of derivative-free 
methodologies for generally constrained oil production optimization problems. Inter- 
national Journal of Mathematical Modelling and Numerical Optimisation (in press) 



2 Derivative-Free Optimization for Oil Field Operations 53 

29. Schulze-Riegert, R.W., Axmann, J.K., Haase, O., Rian, D.T., You, Y.-L.: Evolutionary 
algorithms applied to history matching of complex reservoirs. SPE Reservoir Evalua- 
tion & Engineering 5(2), 163-173 (2002) 

30. Ballester, P.J., Carter, J.N.: A parallel real-coded genetic algorithm for history matching
and its application to a real petroleum reservoir. Journal of Petroleum Science and
Engineering 59, 157-168 (2007) 

31. Maschio, C., Campane Vidal, A., Schiozer, D.J.: A framework to integrate history
matching and geostatistical modeling using genetic algorithm and direct search meth- 
ods. Journal of Petroleum Science and Engineering 63, 34-42 (2008) 

32. Echeverria, D., Mukerji, T.: A robust scheme for spatio-temporal inverse modeling of 
oil reservoirs. In: Anderssen, R.S., Braddock, R.D., Newham, L.T.H. (eds.) Proceedings 
of the 18th World IMACS Congress and MODSIM 2009 International Congress on 
Modelling and Simulation, pp. 4206-4212 (2009)

33. Dadashpour, M., Echeverria Ciaurri, D., Mukerji, T., Kleppe, J., Landrø, M.: A
derivative-free approach for the estimation of porosity and permeability using time- 
lapse seismic and production data. Journal of Geophysics and Engineering 7, 351-368 
(2010) 

34. Aziz, K., Settari, A.: Petroleum Reservoir Simulation. Kluwer Academic Publishers, 
Dordrecht (1979) 

35. Gerritsen, M.G., Durlofsky, L.J.: Modeling fluid flow in oil reservoirs. Annual Review 
of Fluid Mechanics 37, 211-238 (2005) 

36. Cao, H.: Development of Techniques for General Purpose Simulators. PhD thesis, Dept. 
of Petroleum Engineering, Stanford University (2002) 

37. Jiang, Y.: Techniques for Modeling Complex Reservoirs and Advanced Wells. PhD
thesis, Dept. of Energy Resources Engineering, Stanford University (2007) 

38. Streamsim Technologies Inc., 3DSL v2.30 User Manual (2006) 

39. Stewart, R.R.: Exploration Seismic Tomography: Fundamentals. Course Notes Series, 
Society of Exploration Geophysicists (1991) 

40. Devaney, A.J.: Geophysical diffraction tomography. IEEE Transactions on Geoscience 
and Remote Sensing 22(1), 3-13 (1984) 

41. Harris, J.M.: Diffraction tomography with arrays of discrete sources and receivers. 
IEEE Transactions on Geoscience and Remote Sensing 25(4), 448-455 (1987) 

42. Chernov, L.A.: Wave Propagation in a Random Medium. McGraw-Hill, New York 
(1960) 

43. Aki, K., Richards, P.: Quantitative Seismology. W.H. Freeman, New York (1980) 

44. van Essen, G.M., van den Hof, P.M.J., Jansen, J.D.: Hierarchical long-term and short- 
term production optimization. SPE Journal (in press) 

45. Oliver, D.S.: Multiple realizations of the permeability field from well test data. SPE 
Journal 1, 145-154 (1996) 

46. Sarma, P., Durlofsky, L.J., Aziz, K.: Kernel principal component analysis for effi- 
cient, differentiable parameterization of multipoint geostatistics. Mathematical Geo- 
sciences 40, 3-32 (2008) 

47. Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, Heidelberg 
(2006) 

48. Gill, P.E., Murray, W., Saunders, M.A.: SNOPT: an SQP algorithm for large-scale con- 
strained optimization. SIAM Review 47(1), 99-131 (2005) 

49. Torczon, V.: On the convergence of pattern search algorithms. SIAM Journal on Opti- 
mization 7(1), 1-25(1997) 

50. Audet, C, Dennis Jr., I.E.: Analysis of generalized pattern searches. SIAM Journal on 
Optimization 13(3), 889-903 (2002) 



54 D.E. Ciaurri, T. Mukerji, and L.J. Durlofsky 

51. Audet, C, Dennis Jr., J.E.: Mesh adaptive direct search algorithms for constrained op- 
timization. SIAM Journal on Optimization 17(1), 188-217 (2006) 

52. Hooke, R., Jeeves, T.A.: Direct search solution of numerical and statistical problems. 
Journal of the ACM 8(2), 212-229 (1961) 

53. Goldberg, D.E.: Genetic Algorithms in Search, Optimization and Machine Learning. 
Addison-Wesley, Reading (1989) 

54. Eberhart, R.C., Kennedy, J.: A new optimizer using particle swarm theory. In: Pro- 
ceedings of the Sixth International Symposium on Micromachine and Human Science, 
pp. 39-43 (1995) 

55. Kennedy, J., Eberhart, R.C.: Particle swarm optimization. In: Proceedings of IEEE In- 
ternational Joint Conference on Neural Networks, pp. 1942-1948 (1995) 

56. Clerc, M.: Particle Swarm Optimization. ISTE Ltd (2006) 

57. Shi, Y, Eberhart, R.C.: A modified particle swarm optimizer. In: Proceedings of the 
1998 IEEE International Conference on Evolutionary Computation, pp. 69-73 (1998) 

58. Helwig, S., Wanka, R.: Theoretical analysis of initial particle swarm behavior. In: 
Rudolph, G., Jansen, T, Lucas, S., Poloni, C, Beume, N. (eds.) PPSN 2008. LNCS, 
vol. 5199, pp. 889-898. Springer, Heidelberg (2008) 

59. Carlisle, A., Dozier, G.: An off-the-shelf PSO. In: Proceedings of the 2001 Workshop 
on Particle Swarm Optimization, pp. 1-6 (2001) 

60. Fletcher, R., Leyffer, S.: Nonlinear programming without a penalty function. Mathe- 
matical Programming 91, 239-269 (2000) 

61. Wachter, A., Biegler, T.: On the implementation of an interior-point filter line-search 
algorithm for large-scale nonlinear programming. Mathematical Programming 106, 
25-57 (2006) 

62. Audet, C, Dennis Jr., J.E.: A pattern search filter method for nonlinear programming 
without derivatives. SIAM Journal on Optimization 14(4), 980-1010 (2004) 

63. Abramson, M.A.: NOMADm version 4.6 User's Guide. Dept. of Mathematics and 
Statistics, Air Force Institute of Technology (2007) 

64. Christie, M.A., Blunt, M.J.: Tenth SPE comparative solution project: a comparison of 
upscaling techniques. SPE Reservoir Evaluation & Engineering 4, 308-317 (2001) 

65. Isebor, O.J.: Constrained production optimization with an emphasis on derivative-free 
methods. Master's thesis, Dept. of Energy Resources Engineering, Stanford University 
(2009) 

66. Franstrom, K.L., Litvak, M.L.: Automatic simulation algorithm for appraisal of fu- 
ture infill development potential of Prudhoe Bay. SPE paper 59374 presented at the, 
SPE/DOE Improved Oil Recovery Symposium, Tulsa, Oklahoma, April 3-5 (2000) 

67. Giiyagiiler, B., Home, R.N.: Uncertainty assessment of well placement optimization. 
SPE Reservoir Evaluation & Engineering 7(1), 24-32 (2004) 

68. Bangerth, W., Klie, H., Wheeler, M.F., Stoffa, PL., Sen, M.K.: On optimization algo- 
rithms for the reservoir oil well placement problem. Computational Geosciences 10, 
303-319 (2006) 

69. Litvak, M., Gane, B., Williams, G., Mansfield, M., Angert, P., Macdonald, C, Mc- 
Murray, L., Skinner, R., Walker, G.J.: Field development optimization technology. SPE 
paper 106426 presented at the 2007 SPE Reservoir Simulation Symposium, Houston, 
Texas, February 26-28 (2007) 

70. Tupac, Y.J., Faletti, L., Pacheco, M.A.C., Vellasco, M.M.B.R.: Evolutionary optimiza- 
tion of oil field development. SPE paper 107552 presented at the, SPE Digital Energy 
Conference and Exhibition, Houston, Texas, April 11-12 (2007) 

71. Spall, J.C.: An overview of the simultaneous perturbation method for efficient opti- 
mization. Johns Hopkins APL Technical Digest 19(4), 482^92 (1998) 



2 Derivative-Free Optimization for Oil Field Operations 55 

72. Mattot, L.S., Rabideau, A.J., Craig, J.R.: Pump-and-treat optimization using analytic 
element method flow models. Advances in Water Resources 29, 760-775 (2006) 

73. Tarantola, A.: Inverse Problem Theory and Methods for Model Parameter Estimation. 
SIAM, Philadelphia (2005) 

74. Caers, J., Hoffman, T: The probability perturbation method: a new look at Bayesian 
inverse modeling. Mathematical Geology 38(1), 81-100 (2006) 

75. Gosselin, O., van den Berg, S., Cominelli, A.: Integrated history matching of production 
and 4D seismic data. SPE paper 71599 presented at the 2001 SPE Annual Technical 
Conference and Exhibition, New Orleans, Louisiana, September-30 October-3 (2001) 

76. Waggoner, J.R., Cominelli, A., Seymour, R.H.: Improved reservoir modeling with time- 
lapse seismic in a Gulf of Mexico gas condensate reservoir. SPE paper 77514 pre- 
sented at the, SPE Annual Technical Conference and Exhibition, San Antonio, Texas, 
September-29 October-2 (2002) 

77. Aanonsen, S.I., Aavatsmark, I., Barkve, T, Cominelli, A., Gonard, R., Gosselin, O., 
Kolasinski, M., Reme, H.: Effect of scale dependent data correlations in an integrated 
history matching loop combining production data and 4D seismic data. SPE paper 
79665 presented at the, SPE Reservoir Simulation Symposium, Houston, Texas, Febru- 
ary 3-5 (2003) 

78. Miranda, A.A., Le Borgne, Y.A., Bontempi, G.: New routes for minimal approximation 
error to principal components. Neural Processing Letters 27(3), 197-207 (2008) 

79. Castro, S.: A Probabilistic Approach to Jointly Integrate 3D/4D Seismic, Production 
Data and Geological Information for Building Reservoir Models. PhD thesis, Dept. of 
Energy Resources Engineering, Stanford University (2007) 

80. Mavko, G., Mukerji, T, Dvorkin, J.: The Rock Physics Handbook, 2""* edn. Cambridge 
University Press, Cambridge (2009) 

8 1 . Strebelle, S . : Conditional simulation of complex geological structures using multi-point 
statistics. Mathematical Geology 34, 1-21 (2002) 



Chapter 3 

Simulation-Driven Design in Microwave 

Engineering: Application Case Studies 

Slawomir Koziel and Stanislav Ogurtsov 



Abstract. Application of surrogate-based optimization methods to simulation- 
driven microwave engineering design is demonstrated. It is essential for the con- 
sidered techniques that the optimization of the original high-fidelity EM-simulated 
model is replaced by the iterative optimization of its computationally cheap surro- 
gate. The surrogate is updated using available high-fidelity model data to maintain 
its prediction capability throughout the optimization process. The surrogate model 
is constructed from the low-fidelity model which — depending on a particular ap- 
plication case — can be either an equivalent circuit or a coarsely discretized full- 
wave electromagnetic model. Designs satisfying performance requirements are 
typically obtained at the cost of just a few evaluations of the high-fidelity model. 
Here, several surrogate-based design optimization techniques for use in
microwave engineering are discussed. Applications of space mapping,
simulation-based tuning, variable-fidelity optimization, as well as various
response correction techniques are illustrated. Design examples include planar
filters, antennas, and transmission line transition structures.

Keywords: computer-aided design (CAD), microwave design, simulation-driven 
optimization, electromagnetic (EM) simulation, surrogate-based optimization, 
space mapping, tuning, surrogate model, high-fidelity model, coarse model. 



3.1 Introduction 

In this chapter, we first describe several simulation-driven design optimization
methods exploiting physically-based surrogate models, which can be used to
design a variety of microwave structures and devices in a computationally
efficient way. Second, we illustrate the application of these surrogate-based
optimization methods to the design of microwave components. Examples include
microstrip filters, an ultra-wideband (UWB) antenna, a planar Yagi antenna, a
broadband antenna on a multilayer substrate, and a low-loss transition from
coplanar waveguide to microstrip and substrate integrated waveguide. All these
design problems are computationally expensive, so that application of
conventional simulation-driven techniques (e.g., gradient-based algorithms) is
impractical or even infeasible. It will be demonstrated that surrogate-based
methods exploiting physically-based low-fidelity models can generate
satisfactory designs at a cost corresponding to a few high-fidelity
electromagnetic (EM) simulations of the structure of interest.



3.2 Surrogate-Based Design Optimization in Microwave 
Engineering 

A microwave design task can be formulated as a nonlinear minimization problem

$$x_f^* \in \arg\min_x U(R_f(x)) \tag{3.1}$$

where $R_f(x) \in R^m$ denotes the response vector of the device of interest,
e.g., the modulus of the transmission coefficient $|S_{21}|$ evaluated at $m$
different frequencies, and $U$ is a given scalar merit function, e.g., a minimax
function with upper and lower specifications [1]. The vector $x_f^*$ is the
optimal design to be determined. Normally, $R_f$ is obtained through
computationally expensive electromagnetic simulation. It is referred to as
the high-fidelity or fine model.
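
As an illustration of the merit function $U$ in (3.1), the following minimal
sketch evaluates a minimax function with upper and lower specifications on a
sampled $|S_{21}|$ response. The function name `minimax_merit`, the tuple layout
of `specs`, and the toy response are assumptions made for this example only;
they are not taken from [1].

```python
import numpy as np

def minimax_merit(freqs, response_db, specs):
    """Largest specification violation in dB; a value <= 0 means all specs hold.

    specs is a list of (f_lo, f_hi, level_db, kind) with kind "upper" or "lower".
    """
    worst = -np.inf
    for f_lo, f_hi, level_db, kind in specs:
        band = (freqs >= f_lo) & (freqs <= f_hi)
        if kind == "upper":      # response must not exceed level_db in the band
            violation = response_db[band] - level_db
        else:                    # response must not drop below level_db in the band
            violation = level_db - response_db[band]
        worst = max(worst, float(violation.max()))
    return worst

# Made-up band-pass example: |S21| in dB on a 1-3 GHz grid.
freqs = np.linspace(1.0, 3.0, 201)
s21_db = -40.0 * np.abs(freqs - 2.0)          # toy response peaking at 2 GHz
specs = [
    (1.945, 2.055, -3.0, "lower"),            # pass band: |S21| >= -3 dB
    (1.0, 1.405, -20.0, "upper"),             # lower stop band: |S21| <= -20 dB
    (2.595, 3.0, -20.0, "upper"),             # upper stop band: |S21| <= -20 dB
]
u_value = minimax_merit(freqs, s21_db, specs)
```

A negative `u_value` indicates that every specification is met with some margin,
so minimizing this quantity over the design variables pushes the worst-case
violation down, which is the sense in which (3.1) is a minimax problem.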

The conventional way of handling the design problem (3.1) is to employ the EM 
simulator directly within the optimization loop. This direct approach faces some 
fundamental difficulties. The most important one is the high computational cost of 
high-fidelity EM simulation which makes the optimization impractical. Another dif- 
ficulty is that the responses obtained through EM simulation typically have poor 
analytical properties. In particular, EM-based objective functions are inherently
noisy. An additional problem for direct EM-based optimization is that sensitivity
information may be unavailable or expensive to compute. Only recently have
computationally cheap adjoint sensitivities [2] started to become available in
some major commercial EM simulation packages, although for frequency-domain
solvers only [3], [4].

Computationally efficient simulation-driven design can be performed using
surrogate models. Microwave design through surrogate-based optimization (SBO)
[1], [5], [6] is the main focus of this chapter. The primary reason for using the
SBO approach in microwave engineering is to speed up the design process by
shifting the optimization burden to an inexpensive yet reasonably accurate
surrogate model of the device.

In the generic SBO framework described here, the direct optimization of the
computationally expensive EM-simulated high-fidelity model $R_f$ is replaced by
an iterative procedure [1], [6]

$$x^{(i+1)} = \arg\min_x U(R_s^{(i)}(x)) \tag{3.2}$$

that generates a sequence of points (designs) $x^{(i)} \in X_f$, $i = 0, 1, \ldots$,
being approximate solutions to the original design problem (3.1). Each
$x^{(i+1)}$ is the optimal design of the surrogate model
$R_s^{(i)}: X_s^{(i)} \to R^m$, $X_s^{(i)} \subseteq R^n$, $i = 0, 1, \ldots$.
The surrogate $R_s^{(i)}$ is assumed to be a computationally cheap and
sufficiently reliable representation of the fine model $R_f$, particularly in the
neighborhood of the current design $x^{(i)}$. Under these assumptions, the
algorithm (3.2) is likely to produce a sequence of designs that quickly approach
$x_f^*$.

Typically, $R_f$ is only evaluated once per iteration (at every new design
$x^{(i+1)}$) for verification purposes and to obtain the data necessary to update
the surrogate model. Since the surrogate model is computationally cheap, its
optimization cost (cf. (3.2)) can usually be neglected and the total optimization
cost is determined by the evaluation of $R_f$. The key point here is that the
number of evaluations of $R_f$ for a well-performing surrogate-based algorithm is
substantially smaller than for any direct optimization method (e.g., a
gradient-based one) [7].
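
The iteration (3.2) can be sketched on a toy problem as follows. Everything in
this example is an assumption made for illustration: the two analytic
"resonance" models standing in for $R_c$ and $R_f$, the least-squares merit
function, the grid search used to optimize the surrogate, and the particular
surrogate update (adding the current fine-minus-coarse response difference to
the coarse model). The point is only to show that the fine model is evaluated
once per iteration while all searching happens on the cheap surrogate.

```python
import numpy as np

FREQS = np.linspace(0.0, 4.0, 81)            # hypothetical frequency grid
CANDIDATES = np.linspace(1.0, 3.0, 2001)     # designs searched on the cheap surrogate

def coarse(x):
    """Toy low-fidelity model: a resonance centred at the design value x."""
    return 1.0 / (1.0 + 4.0 * (FREQS - x) ** 2)

def fine(x):
    """Toy stand-in for the expensive model: displaced and slightly damped."""
    return 0.95 / (1.0 + 4.0 * (FREQS - x - 0.15) ** 2)

TARGET = 0.95 / (1.0 + 4.0 * (FREQS - 2.0) ** 2)

def merit(response):
    """Least-squares merit function U (an assumption of this sketch)."""
    return float(np.sum((response - TARGET) ** 2))

def sbo(x0, n_iter=5):
    x, history = x0, []
    for _ in range(n_iter):
        rf = fine(x)                         # one high-fidelity evaluation per iteration
        history.append((x, merit(rf)))
        shift = rf - coarse(x)               # surrogate update: match R_f at the current x
        x = float(min(CANDIDATES, key=lambda c: merit(coarse(c) + shift)))
    history.append((x, merit(fine(x))))      # final verification
    return x, history

x_final, history = sbo(2.0)
```

In this toy setting the fine-model optimum is at 1.85 by construction, and the
designs recorded in `history` move from the starting point 2.0 towards it.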

In the remaining part of this section we characterize the surrogate models used 
in microwave engineering (Section 3.2.1) and present several techniques for com- 
putationally efficient simulation-driven design of microwave structures (Sections 
3.2.2 through 3.2.6). Discussion covers the following methods: space mapping [1], 
[7], simulation-based tuning [8], shape-preserving response prediction [9], vari- 
able-fidelity optimization [10], as well as optimization through adaptively adjusted 
design specifications [11]. 

3.2.1 Surrogate Models in Microwave Engineering 

There are a number of ways to create surrogate models of microwave and radio- 
frequency (RF) devices and structures. They can be classified into two groups: 
functional and physical surrogates. Functional models are constructed from sam- 
pled high-fidelity model data using suitable function approximation techniques 
(e.g., polynomial regression [5] or kriging [5]). Physical surrogates exploit fast but 
limited-accuracy models that are physically related to the original structure under 
consideration. 

Here, we focus on methods exploiting physical surrogates. Their primary
advantage is that they are typically able to ensure good accuracy and
generalization capability while using only a few training data points [12].
Physical surrogates are based on underlying physically-based low-fidelity models
of the structure of interest (denoted here as $R_c$). Physically-based models
describe the same physical phenomena as the high-fidelity model, however, in a
simplified manner. In microwave engineering, the high-fidelity model describes
the behavior of the system in terms of the distributions of the electric and
magnetic fields within it (and, sometimes, in its surroundings), which are
calculated by solving the corresponding set of Maxwell equations [13].
Furthermore, the system performance is expressed through certain characteristics
related to its input/output ports (such as the so-called S-parameters [13]). All
of these are obtained as a result of high-resolution electromagnetic simulation
where the structure under consideration is finely discretized. In this context,
the physically-based low-fidelity model of the microwave device can be obtained
through: (i) an analytical description of the structure using theory-based or
semi-empirical formulas; (ii) a different level of physical description of the
system, where the typical example in microwave engineering is an equivalent
circuit [1], in which the device of interest is represented using lumped
components (inductors, capacitors, microstrip line models, etc.); (iii)
low-fidelity electromagnetic simulation. The last approach allows us to use the
same EM solver to evaluate both the high- and low-fidelity models; however, the
latter uses a much coarser simulation mesh, which results in degraded accuracy
but much shorter simulation time. The properties of the three groups of models
are summarized in Table 3.1.

Table 3.1 Physically-based low-fidelity models in microwave engineering

Model Type                          CPU Cost    Accuracy           Availability
----------------------------------  ----------  -----------------  -------------------------------------
Analytical                          Very cheap  Low                Rather limited
Equivalent circuit                  Cheap       Decent             Limited (mostly filters)
Coarsely-discretized EM simulation  Expensive   Good to very good  Generic: available for all structures



3.2.2 Space Mapping 

Space mapping (SM) [1], [7] is probably one of the most recognized SBO
techniques using physically-based low-fidelity (or coarse) models in microwave
engineering. SM exploits the algorithm (3.2) to generate a sequence of
approximate solutions $x^{(i)}$, $i = 0, 1, 2, \ldots$, to problem (3.1). The
surrogate model at iteration $i$, $R_s^{(i)}$, is constructed from the
low-fidelity model so that the misalignment between $R_s^{(i)}$ and the fine
model is minimized using the so-called parameter extraction process, which is a
nonlinear minimization problem in itself [1]. The surrogate is defined as [7]

$$R_s^{(i)}(x) = R_{s.g}(x, p^{(i)}) \tag{3.3}$$

where $R_{s.g}$ is a generic space mapping surrogate model, i.e., the
low-fidelity model composed with suitable transformations, whereas

$$p^{(i)} = \arg\min_p \sum_{k=0}^{i} w_{i.k} \, \| R_f(x^{(k)}) - R_{s.g}(x^{(k)}, p) \| \tag{3.4}$$

is a vector of model parameters and $w_{i.k}$ are weighting factors; a common
choice of $w_{i.k}$ is $w_{i.k} = 1$ for all $i$ and all $k$.

Various space mapping surrogate models are available [1], [7]. They can be
roughly categorized into four groups: (i) Models based on a (usually linear)
distortion of the coarse model parameter space, e.g., input space mapping of the
form $R_{s.g}(x, p) = R_{s.g}(x, B, c) = R_c(B \cdot x + c)$ [1]; (ii) Models
based on a distortion of the coarse model response, e.g., output space mapping of
the form $R_{s.g}(x, p) = R_{s.g}(x, d) = R_c(x) + d$ [7]; (iii) Implicit space
mapping, where the parameters used to align the surrogate with the fine model are
separate from the design variables, i.e.,
$R_{s.g}(x, p) = R_{s.g}(x, x_p) = R_{c.i}(x, x_p)$, with $R_{c.i}$ being the
coarse model dependent on both the design variables $x$ and so-called
preassigned parameters $x_p$ (e.g., dielectric constant, substrate height) that
are normally fixed in the fine model but can be freely altered in the coarse
model [30]; (iv) Custom models exploiting parameters characteristic to a given
design problem; the most characteristic example is the so-called frequency space
mapping $R_{s.g}(x, p) = R_{s.g}(x, F) = R_{c.f}(x, F)$ [1], where $R_{c.f}$ is a
frequency-mapped coarse model, i.e., the coarse model evaluated at frequencies
different from the original frequency sweep for the fine model, according to the
mapping $\omega \to f_1 + f_2 \omega$, with $F = [f_1 \; f_2]^T$.

A thorough discussion of various issues as well as generalizations of space
mapping can be found in the literature [12, 14, 15].
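
The parameter extraction step (3.4) can be illustrated for the simplest input
space mapping, $R_{s.g}(x, c) = R_c(x + c)$ with a scalar design and $B = 1$.
The sketch below is hedged accordingly: the two analytic models, the grid of
candidate shifts, and the use of a plain grid search instead of a nonlinear
optimizer are all assumptions chosen to keep the example self-contained.

```python
import numpy as np

FREQS = np.linspace(0.0, 4.0, 81)                 # hypothetical frequency grid
SHIFTS = np.linspace(-0.5, 0.5, 1001)             # candidate values of the SM parameter c

def coarse(x):
    """Toy low-fidelity response: a resonance centred at the design value x."""
    return 1.0 / (1.0 + 4.0 * (FREQS - x) ** 2)

def fine(x):
    """Toy stand-in for the EM-simulated model (resonance displaced by 0.15)."""
    return 1.0 / (1.0 + 4.0 * (FREQS - x - 0.15) ** 2)

def extract_shift(x_samples):
    """Parameter extraction (3.4) with unit weights for R_s.g(x, c) = R_c(x + c)."""
    fine_data = [fine(xk) for xk in x_samples]    # fine responses are computed once
    def misfit(c):
        return sum(np.linalg.norm(rf - coarse(xk + c))
                   for xk, rf in zip(x_samples, fine_data))
    return float(min(SHIFTS, key=misfit))

c = extract_shift([2.0])
surrogate = lambda x: coarse(x + c)               # R_s(x) as in (3.3)
```

Because the toy fine model is an exactly shifted copy of the coarse one, a single
fine evaluation is enough to align the surrogate everywhere; with real EM data
the alignment is only approximate and is usually best near the sampled designs.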

3.2.3 Simulation-Based Tuning and Tuning Space Mapping 

Tuning space mapping (TSM) [8] combines the concept of tuning, widely used in
microwave engineering [16], [17], and space mapping. It is an iterative
optimization procedure that assumes the existence of two surrogate models: both
are less accurate but computationally much cheaper than the fine model. The
first model is a so-called tuning model $R_t$ that contains relevant fine model
data (typically a fine model response) at the current iteration point and tuning
parameters (typically implemented through circuit elements inserted into tuning
ports). The tunable parameters are adjusted so that the model $R_t$ satisfies the
design specifications. The second model, $R_c$, is used for calibration purposes:
it allows us to translate the change of the tuning parameters into relevant
changes of the actual design variables. $R_c$ depends on three sets of variables:
design parameters, tuning parameters (which are actually the same parameters as
the ones used in $R_t$), and SM parameters that are adjusted using the usual
parameter extraction process [1] in order to have the model $R_c$ meet certain
matching conditions. Typically, the model $R_c$ is a standard SM surrogate (i.e.,
a coarse model composed with suitable transformations) enhanced by the same or
corresponding tuning elements as the model $R_t$. Conceptual illustrations of the
fine model, the tuning model and the calibration model are shown in Fig. 3.1.

Fig. 3.1 Conceptual illustrations of the fine model, the tuning model and the
calibration model: (a) the fine model is typically based on full-wave
simulation, (b) the tuning model exploits the fine model "image" (e.g., in the
form of S-parameters corresponding to the current design imported to the tuning
model using suitable data components) and a number of circuit-theory-based
tuning elements, (c) the calibration model is usually a circuit equivalent
dependent on the same design variables as the fine model, the same tuning
parameters as the tuning model and, additionally, a set of space mapping
parameters used to align the calibration model with both the fine and the tuning
model during the calibration process.

The iteration of the TSM algorithm consists of two steps: optimization of the
tuning model and a calibration procedure. First, the current tuning model
$R_t^{(i)}$ is built using fine model data at point $x^{(i)}$. In general,
because the fine model with inserted tuning ports is not identical to the
original structure, the tuning model response may not agree with the response of
the fine model at $x^{(i)}$ even if the values of the tuning parameters $x_t$ are
zero, so that these values must be adjusted to, say, $x_{t.0}^{(i)}$, in order to
obtain alignment [8]:

$$x_{t.0}^{(i)} = \arg\min_{x_t} \| R_f(x^{(i)}) - R_t^{(i)}(x_t) \| \tag{3.5}$$



In the next step, one optimizes $R_t^{(i)}$ to have it meet the design
specifications. Optimal values of the tuning parameters $x_{t.1}^{(i)}$ are
obtained as follows:

$$x_{t.1}^{(i)} = \arg\min_{x_t} U(R_t^{(i)}(x_t)) \tag{3.6}$$

Having $x_{t.1}^{(i)}$, the calibration procedure is performed to determine
changes in the design variables that yield the same change in the calibration
model response as that caused by $x_{t.1}^{(i)} - x_{t.0}^{(i)}$ [8]. First, one
adjusts the SM parameters $p^{(i)}$ of the calibration model to obtain a match
with the fine model response at $x^{(i)}$:

$$p^{(i)} = \arg\min_p \| R_f(x^{(i)}) - R_c(x^{(i)}, p, x_{t.0}^{(i)}) \| \tag{3.7}$$

The calibration model is then optimized with respect to the design variables in
order to obtain the next iteration point $x^{(i+1)}$:

$$x^{(i+1)} = \arg\min_x \| R_t^{(i)}(x_{t.1}^{(i)}) - R_c(x, p^{(i)}, x_{t.0}^{(i)}) \| \tag{3.8}$$

Note that $x_{t.0}^{(i)}$ is used in (3.7), which corresponds to the state of the
tuning model after performing the alignment procedure (3.5), and $x_{t.1}^{(i)}$
in (3.8), which corresponds to the optimized tuning model (cf. (3.6)). Thus,
(3.7) and (3.8) allow finding the change of design variable values
$x^{(i+1)} - x^{(i)}$ necessary to compensate for the effect of changing the
tuning parameters from $x_{t.0}^{(i)}$ to $x_{t.1}^{(i)}$.

A thorough discussion of variations of tuning space mapping algorithms,
calibration procedures, as well as recent developments in TSM technology can
be found in the literature [18, 19, 20].



3.2.4 Shape-Preserving Response Prediction 

Shape-preserving response prediction (SPRP) [9] is a response correction tech- 
nique that takes advantage of the similarity between responses of the high- and 
low-fidelity models in a very straightforward way. SPRP assumes that the change 
of the high-fidelity model response due to the adjustment of the design variables 
can be predicted using the actual changes of the low-fidelity model response. 
Therefore, it is critically important that the low-fidelity model is physically
based, which ensures that the effect of the design parameter variations on the
model response is similar for both models. In microwave engineering this
property is likely to hold, particularly if the low-fidelity model is the
coarsely discretized structure evaluated using the same EM solver as the one
used to simulate the high-fidelity model.

The change of the low-fidelity model response is described by the translation
vectors corresponding to a certain (finite) number of characteristic points of
the model's response. These translation vectors are subsequently used to predict
the change of the high-fidelity model response with the actual response of $R_f$
at the current iteration point, $R_f(x^{(i)})$, treated as a reference.

Figure 3.2(a) shows an example low-fidelity model response, $|S_{21}|$ in the
frequency range 8 GHz to 18 GHz, at the design $x^{(i)}$, as well as the
low-fidelity model response at some other design $x$. The responses come from the
double folded stub bandstop filter example considered in [9]. Circles denote
characteristic points of $R_c(x^{(i)})$, selected here to represent
$|S_{21}|$ = -3 dB, $|S_{21}|$ = -20 dB, and the local $|S_{21}|$ maximum (at
about 13 GHz). Squares denote corresponding characteristic points for $R_c(x)$,
while line segments represent the translation vectors ("shift") of the
characteristic points of $R_c$ when changing the design variables from $x^{(i)}$
to $x$. Since the low-fidelity model is physically based, the high-fidelity model
response at the given design, here, $x$, can be predicted using the same
translation vectors applied to the corresponding characteristic points of the
high-fidelity model response at $x^{(i)}$, $R_f(x^{(i)})$. This is illustrated in
Fig. 3.2(b). A rigorous formulation of SPRP as well as generalizations of the
basic algorithm can be found in [9].

Fig. 3.2 SPRP concept: (a) Example low-fidelity model response at the design
$x^{(i)}$, $R_c(x^{(i)})$ (solid line), the low-fidelity model response at $x$,
$R_c(x)$ (dotted line), characteristic points of $R_c(x^{(i)})$ (circles) and
$R_c(x)$ (squares), and the translation vectors (short lines); (b) High-fidelity
model response at $x^{(i)}$, $R_f(x^{(i)})$ (solid line) and the predicted
high-fidelity model response at $x$ (dotted line) obtained using SPRP based on
characteristic points of Fig. 3.2(a); characteristic points of $R_f(x^{(i)})$
(circles) and the translation vectors (short lines) were used to find the
characteristic points (squares) of the predicted high-fidelity model response;
low-fidelity model responses $R_c(x^{(i)})$ and $R_c(x)$ are plotted using thin
solid and dotted lines, respectively [9].
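
The translation-vector idea behind SPRP can be sketched in a few lines. This is
only the core step applied to the characteristic points themselves; the
rigorous formulation in [9], including how the full response is rebuilt between
those points, is not reproduced here. The function name and the numerical
values (frequency in GHz, level in dB) are made up for illustration.

```python
import numpy as np

def sprp_predict_points(fine_ref_pts, coarse_ref_pts, coarse_new_pts):
    """Apply the coarse-model translation vectors to the fine-model reference points.

    Each argument is a sequence of (frequency, level) pairs listed in the same order.
    """
    fine_ref_pts = np.asarray(fine_ref_pts, dtype=float)
    translation = (np.asarray(coarse_new_pts, dtype=float)
                   - np.asarray(coarse_ref_pts, dtype=float))
    return fine_ref_pts + translation

# Hypothetical characteristic points: -3 dB point, -20 dB point, local maximum.
coarse_ref = [(10.0, -3.0), (12.0, -20.0), (13.0, -12.0)]   # R_c at x^(i)
coarse_new = [(10.4, -3.0), (12.5, -20.0), (13.4, -11.0)]   # R_c at the new design x
fine_ref = [(9.8, -3.0), (11.9, -20.0), (12.9, -13.0)]      # R_f at x^(i)

predicted = sprp_predict_points(fine_ref, coarse_ref, coarse_new)
```

Note that no fine-model evaluation at the new design is needed to obtain
`predicted`; only the stored reference response of $R_f$ and two coarse
responses are used.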



3.2.5 Multi-fidelity Optimization Using Coarse-Discretization EM 
Models 



The most versatile type of physically-based low-fidelity model in microwave
engineering is the one obtained through EM simulation of a coarsely discretized
version of the structure of interest. The computational cost of the model and its
accuracy can be easily controlled by changing the discretization density. This
feature has been exploited in the multi-fidelity optimization algorithm
introduced in [10].

The design optimization methodology of [10] is based on a family of
coarse-discretization models $\{R_{c.j}\}$, $j = 1, \ldots, K$, all evaluated by
the same EM solver as the one used for the high-fidelity model. Discretization
of the model $R_{c.j+1}$ is finer than that of the model $R_{c.j}$, which results
in better accuracy but also longer evaluation time. In practice, the number of
coarse-discretization models is two or three.

Having the optimized design $x^{(K)}$ of the last (and finest)
coarse-discretization model $R_{c.K}$, the model is evaluated at all perturbed
designs around $x^{(K)}$, i.e., at

$$x_k^{(K)} = [x_1^{(K)} \; \ldots \; x_{|k|}^{(K)} + \mathrm{sign}(k) \cdot d_{|k|} \; \ldots \; x_n^{(K)}]^T, \quad k = -n, -n+1, \ldots, n-1, n.$$

A notation of $R^{(k)} = R_{c.K}(x_k^{(K)})$ is adopted here. This data can be
used to refine the final design without directly optimizing $R_f$. Instead, an
approximation model involving $R^{(k)}$ is set up and optimized in the
neighborhood of $x^{(K)}$ defined as $[x^{(K)} - d, x^{(K)} + d]$, where
$d = [d_1 \; d_2 \; \ldots \; d_n]^T$. The size of the neighborhood can be
selected based on sensitivity analysis of $R_{c.1}$ (the cheapest of the
coarse-discretization models); usually $d$ equals 2 to 5 percent of $x^{(K)}$.

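Here, the approximation is performed using a reduced quadratic model
$q(x) = [q_1 \; q_2 \; \ldots \; q_m]^T$, defined as

$$q_j(x) = q_j([x_1 \; \ldots \; x_n]^T) = \lambda_{j.0} + \lambda_{j.1} x_1 + \ldots + \lambda_{j.n} x_n + \lambda_{j.n+1} x_1^2 + \ldots + \lambda_{j.2n} x_n^2 \tag{3.9}$$

Coefficients $\lambda_{j.r}$, $j = 1, \ldots, m$, $r = 0, 1, \ldots, 2n$, can be
uniquely obtained by solving the linear regression problems

$$\begin{bmatrix}
1 & x_{-n.1}^{(K)} & \cdots & x_{-n.n}^{(K)} & (x_{-n.1}^{(K)})^2 & \cdots & (x_{-n.n}^{(K)})^2 \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
1 & x_{0.1}^{(K)} & \cdots & x_{0.n}^{(K)} & (x_{0.1}^{(K)})^2 & \cdots & (x_{0.n}^{(K)})^2 \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
1 & x_{n.1}^{(K)} & \cdots & x_{n.n}^{(K)} & (x_{n.1}^{(K)})^2 & \cdots & (x_{n.n}^{(K)})^2
\end{bmatrix}
\begin{bmatrix} \lambda_{j.0} \\ \lambda_{j.1} \\ \vdots \\ \lambda_{j.2n} \end{bmatrix}
=
\begin{bmatrix} R_j^{(-n)} \\ \vdots \\ R_j^{(0)} \\ \vdots \\ R_j^{(n)} \end{bmatrix}
\tag{3.10}$$

where $x_{k.j}^{(K)}$ is the $j$th component of the vector $x_k^{(K)}$, and
$R_j^{(k)}$ is the $j$th component of the vector $R^{(k)}$, i.e.,
$R^{(k)} = [R_1^{(k)} \; R_2^{(k)} \; \ldots \; R_m^{(k)}]^T$.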
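
The construction of the perturbed designs and the solution of (3.10) can be
sketched with NumPy as shown below. The helper names (`star_designs`,
`fit_reduced_quadratic`, `evaluate_q`) and the analytic test model are
hypothetical; a least-squares solver is used here for convenience, which for the
square system (3.10) returns the same unique solution.

```python
import numpy as np

def star_designs(x_center, d):
    """Return the 2n+1 designs x_k, k = -n..n (k = 0 is the unperturbed centre)."""
    x_center = np.asarray(x_center, dtype=float)
    d = np.asarray(d, dtype=float)
    n = x_center.size
    designs = []
    for k in range(-n, n + 1):
        x = x_center.copy()
        if k != 0:
            x[abs(k) - 1] += np.sign(k) * d[abs(k) - 1]   # perturb component |k|
        designs.append(x)
    return np.array(designs)

def fit_reduced_quadratic(designs, responses):
    """Solve (3.10) for all m response components at once."""
    A = np.hstack([np.ones((len(designs), 1)), designs, designs ** 2])
    lam, *_ = np.linalg.lstsq(A, responses, rcond=None)
    return lam                      # shape (2n+1, m); column j holds the lambda_j.r

def evaluate_q(lam, x):
    """Evaluate the reduced quadratic model (3.9) at design x."""
    x = np.asarray(x, dtype=float)
    return np.concatenate(([1.0], x, x ** 2)) @ lam

def toy_model(x):
    """Hypothetical response with m = 2 components and no mixed terms."""
    return np.array([1 + 2 * x[0] - x[1] + 0.5 * x[0] ** 2 + 3 * x[1] ** 2,
                     x[0] ** 2 - x[1]])

x_center = np.array([1.0, 2.0])
designs = star_designs(x_center, 0.05 * x_center)
lam = fit_reduced_quadratic(designs, np.array([toy_model(x) for x in designs]))
```

The model is called "reduced" because it omits the mixed second-order terms,
which is what allows it to be identified from only $2n + 1$ evaluations.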

In order to account for unavoidable misalignment between $R_{c.K}$ and $R_f$,
instead of optimizing the quadratic model $q$, it is recommended to optimize a
corrected model $q(x) + [R_f(x^{(K)}) - R_{c.K}(x^{(K)})]$ that ensures a
zero-order consistency [21] between $R_{c.K}$ and $R_f$. The refined design can
then be found as

$$x^* = \arg\min_{x^{(K)} - d \le x \le x^{(K)} + d} U(q(x) + [R_f(x^{(K)}) - R_{c.K}(x^{(K)})]) \tag{3.11}$$

This kind of correction is also known as output space mapping [7]. If necessary,
Step 4 of the procedure summarized below can be performed a few times starting
from a refined design, i.e.,
$x^* = \arg\min\{x^{(K)} - d \le x \le x^{(K)} + d : U(q(x) + [R_f(x^*) - R_{c.K}(x^*)])\}$
(each iteration requires only one evaluation of $R_f$).

The design optimization procedure can be summarized as follows (input arguments
are: the initial design $x^{(0)}$ and the number of coarse-discretization models
$K$):

1. Set $j = 1$;
2. Optimize coarse-discretization model $R_{c.j}$ to obtain a new design
   $x^{(j)}$ using $x^{(j-1)}$ as a starting point;
3. Set $j = j + 1$; if $j \le K$ go to 2;
4. Obtain a refined design $x^*$ as in (3.11);
5. END;



Note that the original model $R_f$ is only evaluated at the final stage (Step 4)
of the optimization process. Operation of the algorithm is illustrated in
Fig. 3.3. Coarse-discretization models can be optimized using any available
algorithm.
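
The whole procedure can be sketched for a scalar design as follows. This is a
toy illustration under stated assumptions, not the implementation of [10]: the
coarse-discretization models are imitated by analytic resonances whose position
error shrinks with fidelity, each model is optimized by a grid search in a
window around the previous design, and the merit function is a least-squares
distance to a target response.

```python
import numpy as np

FREQS = np.linspace(0.0, 4.0, 81)
TARGET = 1.0 / (1.0 + 4.0 * (FREQS - 2.0) ** 2)

def make_model(shift):
    """Toy model whose resonance sits at x + shift (shift = 0 plays the role of R_f)."""
    return lambda x: 1.0 / (1.0 + 4.0 * (FREQS - x - shift) ** 2)

def merit(response):
    return float(np.sum((response - TARGET) ** 2))

def grid_argmin(objective, lo, hi, num=1001):
    return float(min(np.linspace(lo, hi, num), key=objective))

def multi_fidelity(x0, coarse_models, fine, rel_d=0.05):
    x, path = x0, []
    for model in coarse_models:                          # Steps 1-3
        x = grid_argmin(lambda c: merit(model(c)), x - 0.5, x + 0.5)
        path.append(x)
    d = rel_d * abs(x)                                   # neighbourhood half-width
    finest = coarse_models[-1]
    pts = np.array([x - d, x, x + d])
    data = np.array([finest(p) for p in pts])            # R^(k), k = -1, 0, 1
    A = np.column_stack([np.ones(3), pts, pts ** 2])
    lam = np.linalg.solve(A, data)                       # (3.10) for every frequency
    correction = fine(x) - finest(x)                     # the only R_f evaluation
    q = lambda c: np.array([1.0, c, c * c]) @ lam        # reduced quadratic model (3.9)
    x_ref = grid_argmin(lambda c: merit(q(c) + correction), x - d, x + d)  # Step 4
    return x_ref, path

models = [make_model(s) for s in (0.3, 0.1, 0.03)]       # K = 3, coarsest first
fine = make_model(0.0)
x_ref, path = multi_fidelity(1.5, models, fine)
```

Starting each coarse optimization from the previous design keeps the more
expensive models working in an already promising region, and the refinement
step uses a single evaluation of the fine model.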







Fig. 3.3 Operation of the multi-fidelity design optimization procedure for K = 3
(three coarse-discretization models). The design $x^{(j)}$ is obtained as the
optimal solution of the model $R_{c.j}$, $j = 1, 2, 3$. A reduced second-order
approximation model $q$ is set up in the neighborhood of $x^{(3)}$ (gray area)
and the final design $x^*$ is obtained by optimizing the corrected model $q$ as
in (3.11).



3.2.6 Optimization Using Adaptively Adjusted Design 
Specifications 

The techniques described in Sections 3.2.2 to 3.2.5 aim at correcting the
low-fidelity model so that it becomes, at least locally, an accurate
representation of the high-fidelity model. An alternative way of exploiting
low-fidelity models in simulation-driven design of microwave structures is to
modify the design specifications in such a way that the updated specifications
reflect the discrepancy between the models. This approach is extremely simple to
implement because no changes of the low-fidelity model are necessary.

The adaptively adjusted design specifications optimization procedure introduced
in [11] consists of the following two simple steps that can be iterated if
necessary:

1. Modify the original design specifications in order to take into account the
   difference between the responses of $R_f$ and $R_c$ at their characteristic
   points.

2. Obtain a new design by optimizing the coarse model with respect to the
   modified specifications.

Characteristic points of the responses should correspond to the design
specification levels. They should also include local maxima/minima of the
respective responses at which the specifications may not be satisfied.
Figure 3.4(a) shows the fine and coarse model responses at the optimal design of
$R_c$, corresponding to the bandstop filter example considered in [11]; design
specifications are indicated using horizontal lines. Figure 3.4(b) shows
characteristic points of $R_f$ and $R_c$ for the bandstop filter example. The
points correspond to the -3 dB and -30 dB levels as well as to the local maxima
of the responses. As one can observe in Fig. 3.4(b), the selection of points is
rather straightforward.

In the first step of the optimization procedure, the design specifications are 
modified (or mapped) so that the level of satisfying/violating the modified specifi- 
cations by the coarse model response corresponds to the satisfaction/violation lev- 
els of the original specifications by the fine model response. Modified design 
specifications are shown in Fig. 3.4(c). 

The coarse model is subsequently optimized with respect to the modified specifications,
and the new design obtained this way is treated as an approximate solution to the
original design problem (i.e., optimization of the fine model with respect to the
original specifications). Steps 1 and 2 (listed above) can be repeated if necessary.
Substantial design improvement is typically observed after the first iteration;
however, additional iterations may bring further enhancement [11].

Fig. 3.4 Bandstop filter example (responses of $R_f$ and $R_c$ are marked with solid and dashed
lines, respectively) [11]: (a) fine and coarse model responses at the initial design (optimum
of $R_c$) as well as the original design specifications, (b) characteristic points of the responses
corresponding to the specification levels (here, -3 dB and -30 dB) and to the local response
maxima, (c) fine and coarse model responses at the initial design and the modified design
specifications.

It is assumed that the coarse model is physically based, in particular, that the
adjustment of the design variables has a similar effect on the response for both $R_f$
and $R_c$. In such a case, the coarse model design that is obtained in the second stage
of the procedure (i.e., optimal with respect to the modified specifications) will be
(almost) optimal for $R_f$ with respect to the original specifications. As shown in
Fig. 3.4, the absolute matching between the models is not as important as the shape
similarity.
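
The specification-mapping step can be illustrated with a minimal Python sketch. It is not the implementation of [11]: the function name `modify_spec`, the way characteristic points are stored, the nearest-point rule for band edges, and all sample numbers are assumptions made for illustration only. Band edges are shifted by the frequency offset between corresponding level-crossing points of $R_f$ and $R_c$, and the level is shifted by the mean level offset between corresponding local extrema.

```python
import numpy as np

def modify_spec(band, level, fine_cross, coarse_cross, fine_ext, coarse_ext):
    """Map one specification from the fine-model to the coarse-model response.

    band         -- (f_low, f_high) of the original specification
    level        -- original specification level in dB
    fine_cross   -- frequencies where R_f crosses the specification level
    coarse_cross -- corresponding crossing frequencies of R_c (same order)
    fine_ext     -- local extrema of R_f as rows [frequency, level]
    coarse_ext   -- corresponding local extrema of R_c (same order)
    """
    fine_cross = np.asarray(fine_cross, dtype=float)
    coarse_cross = np.asarray(coarse_cross, dtype=float)
    fine_ext = np.asarray(fine_ext, dtype=float).reshape(-1, 2)
    coarse_ext = np.asarray(coarse_ext, dtype=float).reshape(-1, 2)

    # Shift each band edge by the frequency offset of the nearest crossing point.
    new_band = []
    for edge in band:
        k = int(np.argmin(np.abs(fine_cross - edge)))
        new_band.append(edge + coarse_cross[k] - fine_cross[k])

    # Shift the level by the mean level offset of the extrema inside the band.
    inside = (fine_ext[:, 0] >= band[0]) & (fine_ext[:, 0] <= band[1])
    if inside.any():
        d_level = float(np.mean(coarse_ext[inside, 1] - fine_ext[inside, 1]))
    else:
        d_level = 0.0
    return (new_band[0], new_band[1]), level + d_level


# Hypothetical characteristic points (frequencies in GHz, levels in dB).
new_band, new_level = modify_spec(
    band=(2.0, 2.5), level=-3.0,
    fine_cross=[1.98, 2.52], coarse_cross=[2.08, 2.60],
    fine_ext=[[2.25, -1.0]], coarse_ext=[[2.34, -0.5]],
)
```

The coarse model would then be optimized against `new_band` and `new_level` instead of the original specification, and the mapping would be rebuilt after each fine model evaluation.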

3.3 Surrogate-Based Design Optimization of Microwave Filters 

In this section, three examples of microwave filter design using various surrogate-based
optimization techniques are presented. A common feature of these three cases is that the
surrogate model is created by exploiting an equivalent-circuit coarse model, which is
computationally much cheaper than the EM-simulated high-fidelity model. This results in a
significant speedup of the optimization process.

3.3.1 Optimization of a Microstrip Bandpass Filter Using Space 
Mapping Technique 

Consider the fourth-order ring resonator bandpass filter [22] shown in Fig. 3.5(a). The
design parameters are $x = [L_1\ L_2\ L_3\ S_1\ S_2\ W_1\ W_2]^T$ mm. The fine model $R_f$ is
simulated in the EM simulator FEKO [23]. The coarse model, Fig. 3.5(b), is an equivalent
circuit implemented in Agilent ADS [24]. The design goal is to adjust the design variables
so that the modulus of the transmission coefficient of the filter, $|S_{21}|$, satisfies the
following requirements: $|S_{21}| > -1$ dB for 1.75 GHz < $f$ < 2.25 GHz, and
$|S_{21}| < -20$ dB for 1.0 GHz < $f$ < 1.5 GHz and 2.5 GHz < $f$ < 3.0 GHz, where $f$ stands
for frequency. The initial design is the coarse model optimal solution
$x^{(0)} = [24.74\ 19.51\ 24.10\ 0.293\ 0.173\ 1.232\ 0.802]^T$ mm (minimax specification
error +9.0 dB).
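
The minimax specification error quoted above can be computed from a sampled response as the largest violation over all specification bands. The following Python sketch shows one way to do this; the function name `spec_error` and the synthetic response are assumptions for illustration, and only the band limits and levels are taken from the text.

```python
import numpy as np

def spec_error(freq, s21_db, lower_specs, upper_specs):
    """Minimax specification error in dB (negative when all specs are met).

    lower_specs -- list of (f1, f2, level): require |S21| above level on [f1, f2]
    upper_specs -- list of (f1, f2, level): require |S21| below level on [f1, f2]
    """
    freq = np.asarray(freq, dtype=float)
    s21_db = np.asarray(s21_db, dtype=float)
    violations = []
    for f1, f2, level in lower_specs:
        band = (freq >= f1) & (freq <= f2)
        violations.append(np.max(level - s21_db[band]))
    for f1, f2, level in upper_specs:
        band = (freq >= f1) & (freq <= f2)
        violations.append(np.max(s21_db[band] - level))
    return float(max(violations))


# Synthetic, idealized response used only to exercise the function.
freq = np.linspace(1.0, 3.0, 201)                       # GHz
s21_db = np.where((freq >= 1.7) & (freq <= 2.3), -0.5, -25.0)
err = spec_error(
    freq, s21_db,
    lower_specs=[(1.75, 2.25, -1.0)],
    upper_specs=[(1.0, 1.5, -20.0), (2.5, 3.0, -20.0)],
)
```

A positive value, such as the +9.0 dB of the initial design, means that at least one specification is violated by that amount.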

Table 3.2 shows the optimization results. The surrogate model is constructed using input
and output space mapping of the form $R_s^{(i)}(x) = R_c(x + c^{(i)}) + d^{(i)}$ [7], where
$c^{(i)}$ is obtained using the parameter extraction procedure [1], see also Section 3.2.2,
eq. (3.4), whereas $d^{(i)} = R_f(x^{(i)}) - R_c(x^{(i)} + c^{(i)})$. Also, an enhanced model of
the form $R_s^{(i)}(x) = R_c(x + c^{(i)}) + d^{(i)} + E^{(i)}(x - x^{(i)})$ is considered, where
$E^{(i)}$ is an approximation of the Jacobian of $R_f(x) - R_c(x + c^{(i)})$ obtained using
the Broyden update [25]. The space mapping algorithm working with the enhanced surrogate
uses a trust-region convergence safeguard [25].
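
The two surrogate forms can be sketched in Python as follows. This is an illustration under stated assumptions rather than the algorithm of [7] or [25]: the toy models `R_c` and `R_f`, the fixed value of `c_i` (which would normally come from parameter extraction), and the helper names are all hypothetical. The rank-one formula is the standard Broyden update, which enforces the secant condition on the most recent step.

```python
import numpy as np

def broyden_update(E, h, delta_r):
    """Rank-one Broyden update: the returned matrix satisfies E_new @ h == delta_r.

    h       -- step between the two most recent designs
    delta_r -- change of the residual R_f(x) - R_c(x + c) over that step
    """
    h = np.asarray(h, dtype=float)
    delta_r = np.asarray(delta_r, dtype=float)
    return E + np.outer(delta_r - E @ h, h) / (h @ h)

def make_surrogate(R_c, R_f_at_xi, x_i, c_i, E_i=None):
    """Return R_s(x) = R_c(x + c_i) + d_i, plus E_i (x - x_i) when E_i is given."""
    x_i = np.asarray(x_i, dtype=float)
    d_i = R_f_at_xi - R_c(x_i + c_i)      # output space mapping term
    def R_s(x):
        x = np.asarray(x, dtype=float)
        r = R_c(x + c_i) + d_i
        if E_i is not None:
            r = r + E_i @ (x - x_i)       # first-order correction
        return r
    return R_s


# Hypothetical toy models standing in for the circuit and EM simulations.
def R_c(x):
    return np.array([x[0] ** 2 + x[1], x[0] * x[1]])

def R_f(x):
    return R_c(x + np.array([0.1, -0.05])) + 0.02

x_i = np.array([1.0, 2.0])
c_i = np.array([0.08, -0.04])             # assumed result of parameter extraction
h = np.array([0.1, 0.2])
delta_r = np.array([0.3, -0.1])
E_i = broyden_update(np.zeros((2, 2)), h, delta_r)
R_s = make_surrogate(R_c, R_f(x_i), x_i, c_i, E_i)
```

By construction, both surrogate forms agree with the fine model at the current design $x^{(i)}$; the enhanced form additionally tracks the first-order behaviour of the residual.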

Figure 3.6 shows the initial fine model response and the optimized fine model response
obtained using the algorithm with the enhanced surrogate model. Figure 3.7 shows the
convergence plot for both cases. For this example, the first version of the space mapping
algorithm does not converge. Also, the final design is worse than the best one found in
the course of optimization. This illustrates one of the difficulties of the standard SM
technique: the algorithm does not ensure objective function improvement from iteration to
iteration. The algorithm using the approximated Jacobian and trust regions exhibits better
performance.

Fig. 3.5 Fourth-order ring resonator bandpass filter: (a) geometry [22], (b) coarse model
(Agilent ADS).

Fig. 3.6 Fourth-order ring resonator filter: Initial (dashed line) and optimized (solid line)
$|S_{21}|$ versus frequency; optimization using SMxr_b2 algorithm [25] with the $R_c(x + c)$ model:
(a) full frequency range, (b) magnification at 1.4 GHz to 2.6 GHz and -22 dB to dB.

Table 3.2 Fourth-order ring-resonator bandpass filter: optimization results

Surrogate Model                                                       Spec. Error [dB]       Fine Model Runs
                                                                      Final    Best Found
$R_s^{(i)}(x) = R_c(x + c^{(i)}) + d^{(i)}$                           -0.2     -0.3           21*
$R_s^{(i)}(x) = R_c(x + c^{(i)}) + d^{(i)} + E^{(i)}(x - x^{(i)})$    -0.4     -0.4           17

* Convergence not obtained; algorithm terminated after 20 iterations.

Fig. 3.7 Fourth-order ring resonator filter: convergence plots for the SM algorithm using
surrogate model $R_s^{(i)}(x) = R_c(x + c^{(i)}) + d^{(i)}$ (o) and the algorithm using model
$R_s^{(i)}(x) = R_c(x + c^{(i)}) + d^{(i)} + E^{(i)}(x - x^{(i)})$.




3.3.2 Optimization of a Microstrip Bandpass Filter Using Tuning 
Space Mapping 

Consider the box-section Chebyshev microstrip bandpass filter [26] (Fig. 3.8). The
design parameters are $x = [L_1\ L_2\ L_3\ L_4\ L_5\ S_1\ S_2]^T$. The fine model is simulated
in Sonnet em [27] with a grid of 1 mil x 2 mil. The width parameters are $W$ = 40 mil and
$W_1$ = 150 mil. Substrate parameters are: relative permittivity $\varepsilon_r$ = 3.63, and
height $H$ = 20 mil. The design specifications for the transmission coefficient are
$|S_{21}| < -20$ dB for 1.8 GHz < $f$ < 2.15 GHz and 2.65 GHz < $f$ < 3.0 GHz, and
$|S_{21}| > -3$ dB for 2.4 GHz < $f$ < 2.5 GHz.

The filter is optimized using the tuning space mapping technology (Section 3.2.3).
The tuning model is constructed by dividing the polygons corresponding to parameters
$L_1$ to $L_5$ in the middle and inserting the tuning ports at the new cut edges. Its S28P
data file (i.e., the file generated by the EM solver and containing the fine model
S-parameter data) is then loaded into the S-parameter component in Agilent ADS [24].
The circuit-theory coupled-line components and capacitor components are chosen to
be the tuning elements and are inserted into each pair of tuning ports (Fig. 3.9). The
lengths of the imposed coupled lines and the capacitances of the capacitors are assigned
to be the tuning parameters, so that one has
$x_t = [L_{t1}\ L_{t2}\ L_{t3}\ L_{t4}\ L_{t5}\ C_{t1}\ C_{t2}]^T$ ($L_{tk}$ in mil, $C_{tk}$ in pF).

The calibration model is implemented in ADS and shown in Fig. 3.10. It contains the same
tuning elements as the tuning model. It basically mimics the division of the coupled
lines performed while preparing $R_t$. The calibration model also contains six (implicit)
SM parameters that will be used as parameters $p$ in the calibration process [8]. These
parameters are $p = [\varepsilon_{r1}\ \varepsilon_{r2}\ \varepsilon_{r3}\ \varepsilon_{r4}\ \varepsilon_{r5}\ H]^T$,
where $\varepsilon_{rk}$ is the dielectric constant of the microstrip line segment of length
$L_k$ (Fig. 3.8), and $H$ is the substrate height of the filter. Initial values of these
parameters are $[3.63\ 3.63\ 3.63\ 3.63\ 3.63\ 20]^T$.

Fig. 3.8 Chebyshev bandpass filter: geometry [26], and the tuning port insertion points.

Fig. 3.9 Box-section Chebyshev bandpass filter: tuning model (Agilent ADS).

The initial design, $x^{(0)} = [928\ 508\ 50\ 50\ 201\ 5\ 19]^T$ mil, is the optimal solution
of the coarse model, i.e., the calibration model with zero values of the tuning
parameters. The specification error is +19 dB.

Fig. 3.10 Box-section Chebyshev bandpass filter: calibration model (Agilent ADS) [28].



The misalignment between the fine and the tuning model response with the tuning elements
set to zero is negligible (thanks to the co-calibrated port feature in Sonnet em [8]), so
that $x_{t.0}^{(0)} = [0\ 0\ 0\ 0\ 0\ 0\ 0]^T$ was used throughout. The values of the tuning
parameters at the optimal design of the tuning model are
$x_{t.1}^{(0)} = [-85.2\ 132.5\ 5.24\ 1.13\ -15.24\ 0.169\ -0.290]^T$. Note that some of the
parameters take negative values, which is permitted in ADS. The values of the preassigned
parameters obtained in the first calibration phase [8] are
$p^{(0)} = [3.10\ 6.98\ 4.29\ 7.00\ 6.05\ 17.41]^T$.

Figure 3.11 shows the coarse and fine model response at the initial design, as well
as the fine model response after just one TSM iteration (two fine model evaluations)
with $x^{(1)} = [1022\ 398\ 46\ 56\ 235\ 4\ 10]^T$ mil (specification error -1.8 dB).

Fig. 3.11 Box-section Chebyshev bandpass filter: (a) the coarse (dashed line) and fine (solid
line) model response at the initial design; (b) fine model response at the design found
after one iteration of the TSM algorithm.

It should be emphasized that the evaluation time of both the tuning and the calibration
model is very low (a fraction of a second) and is negligible compared to the evaluation
time of the fine model. Therefore, the computational cost of each tuning space mapping
iteration corresponds to two electromagnetic simulations (one for the fine model and one
for the "cut" fine model).



3.3.3 Design of Dual-Band Bandpass Filter Using Shape- 
Preserving Response Prediction 

To illustrate the performance of the shape-preserving response prediction (SPRP)
algorithm [9] (Section 3.2.4), consider the dual-band bandpass filter [29] shown in
Fig. 3.12. The design parameters are $x = [L_1\ L_2\ S_1\ S_2\ S_3\ d\ g\ W]^T$ mm. The fine
model is simulated in Sonnet em [27]. The design specifications are $|S_{21}| > -3$ dB
for 0.85 GHz < $f$ < 0.95 GHz and 1.75 GHz < $f$ < 1.85 GHz, and $|S_{21}| < -20$ dB for
0.5 GHz < $f$ < 0.7 GHz, 1.1 GHz < $f$ < 1.6 GHz and 2.0 GHz < $f$ < 2.2 GHz. The
coarse model is implemented in Agilent ADS [24] (Fig. 3.13). The initial design is
$x^{(0)} = [16.14\ 17.28\ 1.16\ 0.38\ 1.18\ 0.98\ 0.98\ 0.20]^T$ mm (the optimal solution of
$R_c$). The following characteristic points are selected to set up the SPRP surrogate
model [9]: four points for which $|S_{21}|$ = -20 dB, four points with $|S_{21}|$ = -5 dB, as
well as six additional points located between the -5 dB points. For the purpose of
optimization, the coarse model was enhanced by tuning the dielectric constants and the
substrate heights of the microstrip models corresponding to the design variables $L_1$,
$L_2$, $d$ and $g$ (original values of $\varepsilon_r$ and $H$ were 10.2 and 0.635 mm, respectively).
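
The role of the characteristic points in SPRP can be illustrated with a simplified Python sketch. The general idea is that the movement of the coarse model characteristic points between the reference design and a new design is applied to the fine model characteristic points at the reference design. The sketch below is an assumption-laden simplification: the function name `sprp_predict` and the sample points are hypothetical, and the predicted response is obtained by plain linear interpolation through the translated points rather than by the interpolation scheme of [9].

```python
import numpy as np

def sprp_predict(fine_ref_pts, coarse_ref_pts, coarse_new_pts, freq):
    """Predict a fine-model response at a new design from characteristic points.

    fine_ref_pts   -- fine model points at the reference design, rows [f, level]
    coarse_ref_pts -- coarse model points at the reference design (same order)
    coarse_new_pts -- coarse model points at the new design (same order)
    freq           -- frequencies at which the predicted response is evaluated
    """
    fine_ref_pts = np.asarray(fine_ref_pts, dtype=float)
    coarse_ref_pts = np.asarray(coarse_ref_pts, dtype=float)
    coarse_new_pts = np.asarray(coarse_new_pts, dtype=float)

    # Translation vectors describing how the coarse response changed.
    translation = coarse_new_pts - coarse_ref_pts
    # Apply the same change to the fine model characteristic points.
    predicted = fine_ref_pts + translation

    order = np.argsort(predicted[:, 0])
    return np.interp(freq, predicted[order, 0], predicted[order, 1])


# Hypothetical characteristic points (GHz, dB) of a single passband.
fine_ref = [[1.0, -20.0], [1.5, -5.0], [2.0, -20.0]]
coarse_ref = [[1.1, -20.0], [1.6, -5.0], [2.1, -20.0]]
coarse_new = [[1.2, -20.0], [1.7, -4.0], [2.2, -20.0]]
prediction = sprp_predict(fine_ref, coarse_ref, coarse_new, [1.1, 1.35, 1.6, 2.1])
```

Only coarse model evaluations are needed to predict the response at a new design; the fine model is evaluated once per iteration to refresh the reference points.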

Figure 3.14 shows the initial fine model response as well as the fine model
response at the design obtained using the stand-alone SPRP. Table 3.3 shows the
optimization results. Two variants of the SPRP algorithm were considered [9]:
stand-alone and combined with input SM. Note the very small number of fine model
evaluations necessary to yield the optimized design.

Fig. 3.12 Dual-band bandpass filter: geometry [29].

Fig. 3.13 Dual-band bandpass filter: coarse model (Agilent ADS).

Table 3.3 Optimization Results for Dual-Band Bandpass Filter

Algorithm            Final Specification Error [dB]    Fine Model Runs [times]
SPRP
SPRP + input SM
-1.9

Excludes the fine model evaluation at the starting point.
Surrogate model is $R_s^{(i)}(x) = R_c(x + c^{(i)})$; $c^{(i)}$ is found using parameter extraction [9].
Design specifications satisfied after the first iteration (spec. error
Design specifications satisfied after the first iteration (spec. error

Fig. 3.14 Dual-band bandpass filter: fine model (dashed line) and coarse model (thin
dashed line) response at $x^{(0)}$ and the optimized fine model response (solid line) at the
design obtained using SPRP.

3.4 Surrogate-Based Design Optimization of Antennas 



Building a surrogate model may not be straightforward for certain types of microwave
devices since reliable circuit equivalents, such as those used in the previous section
for planar microwave filters, may not be available for many types of antennas, e.g.,
ultra-wideband (UWB) antennas, Yagi-type antennas, or dielectric resonator antennas.
Often, the only way to create a surrogate model is to use a coarsely-discretized
full-wave EM model which is evaluated using the same EM solver as the one used for the
high-fidelity model. However, coarsely-discretized EM simulation is still relatively
expensive, so that typically only a limited number of such simulations can be afforded.
One way to deal with this situation is to generate a smooth and computationally
inexpensive surrogate by approximating sampled coarse-discretization EM data. The
surrogate created this way can then be used in the space mapping optimization process.
Another possibility is to exploit techniques that do not require an excessive number of
coarse-discretization EM simulations. Two such methods, adaptive design specifications
and the multi-fidelity optimization algorithm, are also demonstrated in this chapter for
antenna design.

3.4.1 Design of UWB Antipodal Vivaldi Antenna Using Coarsely-
Discretized EM Models, Kriging and Space Mapping

The example considered here, a UWB antipodal Vivaldi antenna [30] of Fig. 3.15,
shows how to combine functional and physical models to build a surrogate. Design
variables are $x = [a_1\ a_2\ b_1\ b_3\ h_1\ h_2\ d_1]^T$. The profile of the antipodal metal
fins is formed by arcs of ellipses; for the upper fin they are BC, DE, and DB. The point
A is the center of the two ellipses with the arcs BC and DE and the semiaxes $a_1$ and
$b_1$, and $a_2$ and $b_2$, respectively. The point F is the center of the ellipse with the
semiaxes $a_3$ and $b_3$. Note that here $a_3 = (a_2 - a_1)/2$, $b_2 = b_3 + w_s$, and
$d_2 = d_1$. Other parameters are fixed: $w_s$ = 2.15, $w_1$ = 12.9, and $h_3$ = 5 (all in
mm). The antenna metallization is 0.05 mm copper. The fins are interfaced with the
microstrip input (ground width $w_1$) through a linear taper of length $h_2$. The substrate
is Rogers RT5880 (0.787 mm thick) of finite extent, and dielectric losses are maximal at
10 GHz.

The design specifications for reflection are $|S_{11}| < -10$ dB for 3.1 GHz to 10.6 GHz.
Total lateral and longitudinal dimensions are constrained by 100 mm and 200 mm,
respectively. The antenna models include an edge mount SMA connector (AEP part number:
9650-1113-014) [31] and its hex nut since their presence, as seen from numerical
experiments, can affect the radiation pattern, e.g., tilt the main beam from the end-fire
direction, change the gain in the back direction, etc. The connector pin extends 0.5 mm
from the flange over the microstrip signal trace. The upper connector tips, the lower
connector tips, and the microstrip ground are connected with a pair of vias (1 mm in
diameter) going through the substrate.

The mismatch level of the connector-to-input microstrip junction itself is below 
-28 dB in the bandwidth of interest. The antenna models are excited through the 50 
ohm coaxial port which is in the SMA connector. 

The initial design is $x^{init} = [30\ 50\ 10\ 10\ 100\ 20\ 2]^T$ mm. The high-fidelity antenna
model is evaluated with the CST MWS transient solver [3] (8,954,244 mesh cells at
$x^{init}$, simulation time 1 h 45 min).



Fig. 3.15 Vivaldi antenna: top view, substrate shown transparent.






Here, a suitable equivalent-circuit coarse model is not available to apply optimization
using space mapping. Instead, a coarse-discretization CST model $R_{cd}$ (1,039,008 mesh
cells at $x^{init}$, evaluation time 6 minutes) is used. $R_{cd}$ is still computationally too
expensive to be used directly as a coarse model; therefore, a coarse model $R_c$ is
created in the neighbourhood of the starting point (here, the approximate optimum of
$R_{cd}$), using kriging interpolation [5] of the $R_{cd}$ data. The procedure is as follows.

1. Allocate $N$ base designs, $X_B = \{x^1, \ldots, x^N\}$, using Latin Hypercube Sampling [32];

2. Evaluate $R_{cd}$ at each design $x^j$, $j = 1, 2, \ldots, N$;

3. Build $R_c$ as a kriging interpolation of data pairs $\{(x^j, R_{cd}(x^j))\}_{j=1,\ldots,N}$.

The coarse model created this way is computationally cheap, easy to optimize, and
yet retains the features of a physically-based model. The starting point for space
mapping optimization, $x^{(0)} = [37.57\ 32.85\ 25.75\ 53.34\ 122.55\ 32.31\ 1.129]^T$ mm, is
the approximate optimum of $R_{cd}$. The kriging coarse model $R_c$ is set up in the
vicinity of $x^{(0)}$ using $N$ = 100 base points.
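
The three-step procedure for building $R_c$ can be sketched in Python with NumPy alone. This is a simplified illustration, not the kriging implementation of [5]: the interpolant uses a Gaussian correlation function with a fixed, hand-picked parameter `theta` and the sample mean as trend (proper kriging would estimate both from the data), and `R_cd` here is a cheap hypothetical stand-in for the coarse-discretization EM simulation.

```python
import numpy as np

def latin_hypercube(n, lower, upper, rng):
    """Step 1: n designs, one sample per stratum in every dimension."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    u = (np.arange(n)[:, None] + rng.random((n, lower.size))) / n
    for k in range(lower.size):
        u[:, k] = rng.permutation(u[:, k])   # decouple the dimensions
    return lower + u * (upper - lower)

def fit_kriging(X, Y, theta=10.0, nugget=1e-10):
    """Step 3: simplified kriging-type interpolant of the pairs (X[j], Y[j])."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    scale = X.max(axis=0) - X.min(axis=0)    # assumes no constant column in X
    Xn = X / scale

    def corr(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-theta * d2)

    mean = Y.mean(axis=0)
    weights = np.linalg.solve(corr(Xn, Xn) + nugget * np.eye(len(X)), Y - mean)

    def predict(x):
        x = np.atleast_2d(np.asarray(x, dtype=float)) / scale
        return mean + corr(x, Xn) @ weights
    return predict


def R_cd(x):
    """Hypothetical stand-in for the coarse-discretization EM model."""
    return np.array([np.sin(3.0 * x[0]) + x[1], x[0] * x[1]])

rng = np.random.default_rng(0)
X_B = latin_hypercube(20, lower=[0.0, 0.0], upper=[1.0, 2.0], rng=rng)
Y_B = np.array([R_cd(x) for x in X_B])       # Step 2: evaluate R_cd at each design
R_c = fit_kriging(X_B, Y_B)
```

Once fitted, `R_c` costs only a small matrix-vector product per evaluation, which is what makes it usable as the coarse model inside the space mapping loop.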

Figure 3.16 shows the fine model reflection response at the initial design as well
as that of the fine and coarse-discretization model $R_{cd}$ at $x^{(0)}$. The final design,
$x^{(2)} = [37.66\ 33.16\ 25.21\ 53.22\ 122.50\ 33.06\ 1.012]^T$ mm, is obtained after two space
mapping iterations (Fig. 3.17). The surrogate model used by the optimization algorithm
exploited input and output space mapping of the form $R_s(x) = R_c(x + c) + d$ [7].
Optimization costs are summarized in Table 3.4. The total design time corresponds to
about 16 evaluations of the fine model. It should be noted that the design improvement
between $x^{(0)}$ and $x^{(2)}$ is somewhat limited, which is because of the limited accuracy of
the coarse model, as shown in Fig. 3.16. The far-field response of the final design at
selected frequencies is shown in Fig. 3.18.

Table 3.4 UWB Vivaldi antenna: optimization cost

Algorithm Component      Number of Model Evaluations    CPU Time
                                                        Absolute       Relative to $R_f$
Optimization of $R_{cd}$   135 x $R_{cd}$                 13.5 hours     7.7
Setting up $R_c$           100 x $R_{cd}$                 10.0 hours     5.7
Evaluation of $R_f$        3 x $R_f$                      5.3 hours      3.0
Total cost               N/A                            28.8 hours     16.4
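
The entries of Table 3.4 follow directly from the evaluation counts and the per-run simulation times quoted in the text (6 minutes for $R_{cd}$, 1 h 45 min for $R_f$). A short Python check of this arithmetic is given below; the function name is hypothetical.

```python
def design_cost(items, fine_minutes):
    """Return (total hours, equivalent number of fine model runs).

    items        -- list of (number of evaluations, minutes per evaluation)
    fine_minutes -- evaluation time of the high-fidelity model in minutes
    """
    total_minutes = sum(count * minutes for count, minutes in items)
    return total_minutes / 60.0, total_minutes / fine_minutes


# 135 and 100 runs of R_cd at 6 min each, 3 runs of R_f at 105 min each.
hours, relative = design_cost([(135, 6), (100, 6), (3, 105)], fine_minutes=105)
```

With these inputs the total comes to 28.75 hours, or roughly 16.4 equivalent fine model runs, in agreement with the last row of the table.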




Fig. 3.18 Gain [dBi] of the Vivaldi antenna, x-pol. component: pattern cut in YOZ plane at 4
GHz (—), 6 GHz (- - -), 8 GHz ( ), and 10 GHz (· · ·). 90° on the left, 0°, and 90° on the
right are for Y, Z, and -Y directions, respectively.






3.4.2 Design of Planar Yagi Antenna Using Adaptive Design 
Specifications 

Performance of the adaptive design specifications methodology [11] can be demonstrated
with design optimization of a planar Yagi antenna for the 2.4-2.5 GHz band [33].
Optimization of planar Yagi antennas on a finite substrate is a challenging task: the
finite substrate and the proximity of the feeding circuitry to the radiators both
introduce additional degrees of freedom to the design, complicate the use of methods
developed for Yagi aerials [34, 35], and permit only a limited use of existing design
techniques [36].

Design geometry. The considered Yagi antenna comprises three directors, one driving
element of a modified shape consisting of partially overlapping strips, and the feeding
microstrip ground plane serving also as the reflector. The presented antenna can be
viewed as a planar realization of the five-element Yagi. An outline of the antenna is
given in Fig. 3.19. The antenna components are defined on a single layer of 0.025" thick
Rogers RT6010 substrate which has an extent of 100 mm x 160 mm. The ground extent is
100 mm x 40 mm. The input 50 ohm microstrip is to be interfaced to the terminals of the
driving element through a section of the parallel strip transmission line in a way that
provides a balanced input to the antenna. The antenna model is defined with CST MWS,
discretized with subgrids, and simulated using the CST transient solver.

Design objectives. Maximum directivity of the principal polarization (E-field is
parallel to the XOZ plane) in the 2.4-2.5 GHz band is chosen as the main objective.
The following antenna figures are treated as constraints (also in the 2.4-2.5 GHz
band): the side lobe level relative to maximum (SLL < -10 dB), front-to-back ratio
(FBR < -12 dB), and direction of maximal radiation $\theta_m$ (elevation angle from the
Z axis, $|\theta_m|$ < 1°). The antenna should be interfaced to the 50 ohm environment so
that $|S_{11}| < -10$ dB in the 2.4-2.5 GHz band.

Fig. 3.19 Printed quasi-Yagi antenna: (a) the model used at the optimization stage of design
(no feeding section), (b) the model updated with a feeding section starting from the 50 ohm
microstrip. Source impedance is not shown in the diagrams. For simplicity, the feeding section
in panel (b) is shown as a simple two-section structure: 50 ohm microstrip (length and
width) and parallel strips (dimensions $l_{p1}$ and $w_{p1}$). Detailed geometry of the optimized
feeding section is given in Fig. 3.21(a).

Design stages. As the input impedance of Yagi antennas is typically sensitive to
variations of antenna dimensions [36], and since its value is not available prior to
simulation while it is needed to define the feeding part of the antenna, the design
optimization proceeds in two major steps. First, the antenna is optimized for maximal
directivity subject to the constraints on SLL, FBR, and $\theta_m$. At this step the
excitation is applied directly at the driving element's terminals (Fig. 3.19(a)). The
design optimization procedure is surrogate-based and involves optimization of a
coarse-discretization antenna model. Second, once the optimal design is available, the
feed interfacing the 50 ohm input and the driving element terminals is designed.

Optimization methodology. Optimization of the coarse-discretization antenna model is
carried out much faster than that of the high-fidelity model. However, the
coarse-discretization model is also less accurate: the figures of interest (e.g.,
directivity, SLL, FBR) are shifted in frequency with respect to those of the
high-fidelity model. The frequency relationship between the two models is captured
using characteristic points (e.g., local maxima, points of corresponding response
levels), as shown in Fig. 3.20. Using this relationship, the original frequency band of
interest is mapped into the corresponding band that is used in the optimization of the
coarse-discretization model. This procedure, i.e., mapping of the frequency band,
optimization of the coarse-discretization model and evaluation of the high-fidelity
model, is performed a few times, as the frequency dependence between the models'
responses may change from one design to another. The high-fidelity model is only
evaluated a few times for verification purposes and to set up a new mapping.
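
The band-mapping step can be sketched as a piecewise-linear frequency relationship built from corresponding characteristic points of the two models. The Python fragment below is one possible realization under stated assumptions: the function name `map_band` and the sample frequencies are hypothetical, and linear interpolation between points is a choice made here, not a detail given in the text.

```python
import numpy as np

def map_band(band, fine_freqs, coarse_freqs):
    """Map a frequency band of the high-fidelity model onto the coarse model.

    fine_freqs   -- frequencies of characteristic points of the high-fidelity model
    coarse_freqs -- frequencies of the corresponding coarse model points
    Band edges outside the range of fine_freqs are clamped by np.interp.
    """
    fine_freqs = np.asarray(fine_freqs, dtype=float)
    coarse_freqs = np.asarray(coarse_freqs, dtype=float)
    order = np.argsort(fine_freqs)          # np.interp needs increasing abscissas
    mapped = np.interp(band, fine_freqs[order], coarse_freqs[order])
    return float(mapped[0]), float(mapped[1])


# Hypothetical characteristic-point frequencies in GHz.
coarse_band = map_band(
    band=(2.4, 2.5),
    fine_freqs=[2.30, 2.45, 2.60],
    coarse_freqs=[2.36, 2.52, 2.68],
)
```

The coarse-discretization model would then be optimized over `coarse_band` instead of the original 2.4-2.5 GHz band, and the mapping would be rebuilt after each high-fidelity evaluation.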

Results. The design variables when optimizing the antenna for maximal directivity
are x = [l_1 l_2 l_3 l_4 s_1 s_2 s_3 s_4 w_0 w,,]^T (Fig. 3.19(a)). Other parameters are
fixed: $l_s$ = 160, $w_s$ = 100, $l_g$ = 40, $w_g$ = 100, and $h$ = 0.635 (all in mm). The
initial design is $x^{(0)} = [40.42\ 35.7\ 31.5\ 27.3\ 17.85\ 22.05\ 22.05\ 22.05\ 2.35\ 1.5]^T$.
Simulation time of the coarse-discretization model (51,580 cells at $x^{(0)}$) is about
6 minutes, and it is about 2 hours for the original, high-fidelity model (1,096,980
cells at $x^{(0)}$). The optimum is found at
$x^{*} = [40.87\ 37.31\ 34.33\ 29.80\ 17.35\ 22.55\ 23.05\ 24.55\ 1.55\ 2.13]^T$.

Based on the optimum $x^{*}$ and the antenna impedance at the driving element terminals
(Fig. 3.19(a)), a feed is designed (Fig. 3.21) using analytical formulas with a
microstrip (length 35 mm, width 0.586 mm) and parallel strips ($l_{p1} = s_1 - w_0/2$,
$w_{p1}$ = 0.36 mm). The updated antenna model is then simulated. Its reflection does not
meet the design specifications for frequencies over 2.484 GHz. Therefore, the feed is
redesigned with the geometry of Fig. 3.21(a) through optimization of its full-wave model
and the schematic of Fig. 3.21(b). Dimensions of the simple feed are used as an initial
guess. Optimal feed dimensions are found to be
$[w_{p1}\ w_{p2}\ w_{p3}\ l_{p1}\ l_{p2}]^T = [0.428\ 0.275\ 0.245\ 0.575\ 8.08]^T$. The updated antenna
model is then simulated, and responses are shown in Figs. 3.22 through 3.25 and
Table 3.5. Table 3.6 shows the computational cost of the optimization process, which
corresponds to only 19 full-wave antenna simulations.

Fig. 3.20 Directivity versus frequency for the antenna structure (solid line) and its coarse-
discretization model (dashed line). Characteristic points (squares and circles) are used to
establish a frequency relationship between the two responses and to map the original
frequency band of interest (2.4 to 2.5 GHz, thick solid line) into the corresponding band
used in the coarse-discretization model optimization (thick dashed line).






Fig. 3.21 A feed interfacing the 50 ohm input and the driving elements: (a) geometry of its
full-wave model; (b) implemented schematic.

Fig. 3.22 Antenna impedance: resistance $R_{in}$ (thick solid) and reactance $X_{in}$ (thick dash)
at the antenna input (Fig. 3.19(b)); resistance (solid) and reactance (dash-dot) at the
driving element terminals (Fig. 3.19(a)).

Table 3.5 Printed Yagi antenna: performance summary

Figure                                        Value
Directivity, maximum                          10 dBi
Directivity, end fire maximum                 9.85 dBi
IEEE gain, end fire maximum                   9.49 dBi
Radiation efficiency, minimum                 92%
Front to back ratio (FBR), minimum            15.5 dB
Side lobe level (SLL), maximum                -10.2 dB
End-fire polarization purity, minimum         40 dB
3 dB beamwidth at 2.45 GHz                    E-plane: 59°, H-plane: 74°
Relative bandwidth ($|S_{11}| < -10$ dB)        4.4%
Input impedance at resonance (2.466 GHz)      54.6 ohms

* maximum/minimum over 2.4 GHz to 2.5 GHz



Table 3.6 Printed Yagi antenna: optimization cost summary

Algorithm Component                        # of Model Evaluations   Absolute Time   Relative Time [c]
Coarse-discretization model optimization   316 [a]                  32 h            16
High-fidelity antenna simulation           3 [b]                    6 h             3
Total optimization time [d]                -                        38 h            19

[a] Total number of evaluations (coarse-discretization model is optimized once
per iteration, two iterations were performed in total).
[b] Evaluation at the initial design and after each iteration.
[c] Equivalent number of high-fidelity antenna simulations.
[d] Does not include the time necessary to design the antenna feed, which is
negligible compared to the optimization time of the antenna itself.

Fig. 3.23 Reflection from the input of the Yagi antenna.

Fig. 3.24 Front-to-back ratio.

Fig. 3.25 Directivity pattern at 2.4 GHz (solid), 2.45 GHz (dash-dot), and 2.5 GHz (dash):
(a) co-pol. in the E-plane (XOZ); (b) x-pol. in the H-plane (YOZ).

3.4.3 Multi-fidelity Design of Microstrip Broadband Antenna



Application of the multi-fidelity optimization algorithm [10] is demonstrated below
using the broadband antenna [37] shown in Fig. 3.26. Here,
$x = [l_1\ l_2\ l_3\ l_4\ w_2\ w_3\ d_1\ s]^T$ are the design variables. The multilayer substrate
is $l_s \times l_s$ ($l_s$ = 30 mm). The stack (from bottom to top) is: ground, RO4003, signal
trace, RO3006 with a through via (trace-to-patch), the driven patch, RO4003, and four
patches. Feeding is with a 50 ohm SMA connector.

The design objective is $|S_{11}| < -10$ dB for 3.1 GHz to 10.6 GHz. An IEEE gain of not
less than 5 dB for the zero elevation angle over the band is an optimization constraint.
The initial design is $x^{(0)}$ = [15 15 15 15 20 ^2 2]^T mm. Two coarse-discretization
models are used: $R_{c.1}$ (122,713 mesh cells at $x^{(0)}$) and $R_{c.2}$ (777,888 mesh cells).
The evaluation times for $R_{c.1}$, $R_{c.2}$ and $R_f$ (2,334,312 mesh cells) are 3 min,
18 min and 160 min at $x^{(0)}$, respectively. Figure 3.27(a) shows the responses of
$R_{c.1}$ at $x^{(0)}$ and at its optimal design $x^{(1)}$. Figure 3.27(b) shows the responses of
$R_{c.2}$ at $x^{(1)}$ and at its optimized design $x^{(2)}$. Figure 3.27(c) shows the responses
of $R_f$ at $x^{(0)}$, at $x^{(2)}$ and at the refined design
$x^{*} = [14.87\ 13.95\ 15.4\ 13.13\ 20.87\ -5.90\ 2.88\ 0.68]^T$ mm ($|S_{11}| < -11.5$ dB for
3.1 GHz to 4.8 GHz) obtained in two iterations of the refinement step [10], see also
Section 3.2.5, eq. (3.13). The design cost (Table 3.7) corresponds to about 12 runs of
the high-fidelity model $R_f$. Antenna gain at the final design is shown in Fig. 3.28.

Fig. 3.26 Microstrip broadband antenna: top/side views, substrates shown transparent.

Table 3.7 Microstrip broadband antenna: design cost

Design Step                Model Evaluations [a]   Computational Cost
                                                   Absolute [hours]   Relative to $R_f$
Optimization of $R_{c.1}$    125 x $R_{c.1}$         6.3                2.6
Optimization of $R_{c.2}$    48 x $R_{c.2}$          14.4               5.4
Setup of model $q$           17 x $R_{c.2}$          5.1                1.9
Evaluation of $R_f$          2 x $R_f$               5.3                2.0
Total design time          N/A                     31.1               11.9

[a] Excludes $R_f$ evaluation at the initial design.

Fig. 3.27 Microstrip broadband antenna: (a) responses of the coarse-discretization model
$R_{c.1}$ at the initial design $x^{(0)}$ ( — ) and at the optimized design $x^{(1)}$ ( — );
(b) responses of the coarse-discretization model $R_{c.2}$ at $x^{(1)}$ ( ) and at its optimized
design $x^{(2)}$ ( — ); (c) responses of the high-fidelity model $R_f$ at $x^{(0)}$ (····), at
$x^{(2)}$ ( ) and at the refined final design $x^{*}$ ( — ). Horizontal axis: frequency [GHz].

Fig. 3.28 Microstrip antenna, gain [dBi] of the final design at 3.5 GHz (· - ·), 4.0 GHz ( ),
and 4.5 GHz ( — ): (a) co-pol. in the E-plane (XOZ), and connector is at 90° on the right;
(b) x-pol., primary, (thick lines) and co-pol. (thin lines) in the H-plane (YOZ).

3.5 Surrogate-Based Design Optimization of Microwave 
Transitions 

Design of low-loss broadband transitions interfacing different types of transmission
lines at microwave frequencies usually involves full-wave EM simulation to accurately
describe the transition responses [38, 39]. Circuit models and analytical formulas, when
available, can only be used to get initial designs, which should then be verified and
tuned to meet the design requirements. Typically, reliable circuit models are either
unavailable or require a significant amount of development and validation effort.
Moreover, additions or modifications introduced in the transition geometry may
invalidate existing models, which leads to repeating the model development procedure.
On the other hand, optimization techniques exploiting surrogates [40], including those
based on coarsely-discretized EM models, may substantially reduce the computational
complexity of the conventional optimization methods and, at the same time, be applied
to modified/improved geometries without extra effort.

Two examples are presented in this section. The first one illustrates the multi-fidelity
design optimization technique used to improve the performance of a coplanar
waveguide-to-microstrip transition based on EM coupling. The second example demonstrates
the use of the adaptive design specifications method for the design of a coplanar
waveguide-to-substrate integrated waveguide transition. In both cases the use of
coarsely-discretized EM models is essential, since no accurate circuit equivalents are
available for the considered structures.

3.5.1 Multi-fidelity Design of Microstrip-to-Coplanar Waveguide 
Transition 

Here, the multi-fidelity optimization algorithm [10] is applied to design optimization 
of a microstrip-to-CPW transition [41]. The methodology exploits sequential optimiza- 
tion of coarse-discretization EM models. The optimal design of the current model is 
used as an initial design for the finer-discretization one. The final design is then refined 
using a polynomial-based approximation model of the responses obtained from the 
coarse-discretization simulations. The design process is computationally efficient be- 
cause the optimization burden is shifted to the coarse-discretization models. 
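
The sequential stage described above can be sketched as follows. This is only a minimal illustration and not the implementation of [10]: the two "coarse" objectives and the "fine" objective are hypothetical analytical stand-ins for EM simulations, `pattern_search` is a simple coordinate search written for this sketch, and the final polynomial-based refinement step is omitted.

```python
import numpy as np

def pattern_search(objective, x0, step=0.5, tol=1e-3, max_evals=2000):
    """Minimal coordinate pattern search (illustrative helper, not from [10])."""
    x = np.asarray(x0, dtype=float)
    fx, n_evals = objective(x), 1
    while step > tol and n_evals < max_evals:
        improved = False
        for k in range(x.size):
            for s in (step, -step):
                trial = x.copy()
                trial[k] += s
                ft = objective(trial)
                n_evals += 1
                if ft < fx:              # accept the first improving move
                    x, fx, improved = trial, ft, True
                    break
        if not improved:
            step /= 2.0                  # refine the mesh when no move helps
    return x, n_evals

def multi_fidelity_sequence(coarse_objectives, x0):
    """Optimize each coarse model in turn, coarsest first.

    The optimum of one model is the starting point for the next, finer one.
    """
    x, evals = np.asarray(x0, dtype=float), []
    for objective in coarse_objectives:
        x, n = pattern_search(objective, x)
        evals.append(n)
    return x, evals

# Hypothetical stand-ins: the optimum drifts toward the fine-model optimum
# as the discretization is refined.
X_FINE = np.array([1.0, 2.0])

def make_objective(offset):
    return lambda x: float(np.sum((np.asarray(x) - (X_FINE + offset)) ** 2))

coarse_1, coarse_2, fine = make_objective(0.3), make_objective(0.1), make_objective(0.0)
x_final, evals = multi_fidelity_sequence([coarse_1, coarse_2], x0=[0.0, 0.0])
fine_value = fine(x_final)   # the expensive model is used for verification only
```

In this toy setting the design returned by the second stage is closer to the fine-model optimum than the first-stage optimum, while the fine model itself is evaluated only once.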

Two frequency bands with the center frequencies fc of 5 GHz and 10 GHz are of
interest for this transition [41] (Fig. 3.29). The port-to-port distance is 20 mm. The
transition geometry and the input transmission lines (TLs) are on a 0.635 mm thick
RT6010 substrate. Metallization (5.7e8 S/m) is 0.0254 mm thick. Dimensions of
the input TLs are the following: Wm = 0.6, Wc = 0.8, and Sc = 0.3 (all in mm). The
ground plane is common to the CPW and microstrip and is modelled as having infinite
lateral extent. The low-frequency TL impedances are about 50 ohms each. All
models are simulated using the CST MWS transient solver.
Fig. 3.29 Coplanar waveguide-to-microstrip transition with EM coupling [41]: (a) 3D view,
substrate shown transparent; (b) layout views. 



In this example the ground plane is modelled as having infinite lateral extent. The
design objective is the 50% symmetrical bandwidth at the -20 dB level for both |S11|
and |S22|. The design variables are x = [L1 W1 W2 L2 i^]^T mm. Designs start from x^(0) =
[L1^(0) 0.8 0.3 0]^T, where L1^(0) = 6 and 3 for the 5 GHz and 10 GHz designs,
respectively. For this example one again uses two coarse-discretization models Rc.1 and Rc.2
with the following evaluation times: 60 s and 100 s (5 GHz) and 71 s and 106 s
(10 GHz). The fine model evaluation time is 17 min (5 GHz) and 26 min (10 GHz).

The optimal designs are found to be x* = [6.200 1.105 0.113 0.319 -0.033]^T for
5 GHz and [2.877 1.017 0.038 0.287 -0.090]^T for 10 GHz. Both final designs
meet the specifications completely, as shown in Fig. 3.30. Figure 3.31
shows the transmission responses |S21| versus frequency at the initial and final
designs. Significant improvement in reflection and transmission responses (in level
and bandwidth) is achieved: the bandwidth was extended to 53% from the initial 0%
for the 5 GHz design and to 51% from the initial 20% for the 10 GHz design. The design
costs are 9.3 evaluations of the fine model for the 5 GHz design and 7.0 evaluations
of the fine model for the 10 GHz design (see Table 3.8 for details).

As a comparison with "classical" simulation-driven design, the transition for
fc = 10 GHz has also been designed through direct optimization of the fine model
using the pattern search algorithm [42]. The final design obtained this way is almost
as good as that produced by the multi-fidelity technique (50% bandwidth);
however, the design cost is almost 18 times higher (124 evaluations of Rf versus
about 7 for the multi-fidelity algorithm).

The reduced quadratic model [10], see also Section 3.2.5, eqs. (3.11)-(3.12), is 
also utilized to perform sensitivity analysis of the final designs. For this purpose, 
however, the quadratic model is set up using the high-fidelity model data. Because 
sensitivity analysis is performed assuming relatively small deviations around the 
optimized design (0.0125 and 0.025 mm for geometry variables), the accuracy of 
the quadratic model is sufficiently good with respect to Rf. Results of the sensitiv- 
ity analysis are shown in Fig. 3.32. 
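
A minimal sketch of such a sensitivity analysis is given below. It assumes that the reduced quadratic model has no mixed terms, i.e., q(x) = a0 + sum_j a_j x_j + sum_j b_j x_j^2, which is consistent with the 11 = 2·5 + 1 model evaluations reported for its setup in Table 3.8. Eqs. (3.11)-(3.12) are not reproduced here, so this model form, the star-shaped training set, the uniform distribution of the random samples, and the stand-in `response` function are all assumptions of the example.

```python
import numpy as np

def star_designs(x_opt, d):
    """Optimized design plus +/- d perturbations of each variable (2n + 1 designs)."""
    x_opt = np.asarray(x_opt, dtype=float)
    designs = [x_opt.copy()]
    for k in range(x_opt.size):
        for s in (d, -d):
            p = x_opt.copy()
            p[k] += s
            designs.append(p)
    return np.array(designs)

def _basis(X, x_opt, d):
    # Centred and scaled variables keep the least-squares problem well conditioned.
    Z = (np.asarray(X, dtype=float) - np.asarray(x_opt, dtype=float)) / d
    return np.hstack([np.ones((Z.shape[0], 1)), Z, Z ** 2])

def fit_reduced_quadratic(X, R, x_opt, d):
    """Fit a quadratic without mixed terms to responses R (one row per design)."""
    coef, *_ = np.linalg.lstsq(_basis(X, x_opt, d), R, rcond=None)
    return coef

def evaluate_reduced_quadratic(coef, X, x_opt, d):
    return _basis(X, x_opt, d) @ coef

def response(x):
    """Hypothetical stand-in for a simulated response at three frequency points."""
    x = np.asarray(x, dtype=float)
    return np.array([np.sum(x ** 2), np.sum(x), 1.0 + np.sum((x - 1.0) ** 2)])

x_opt = np.array([6.200, 1.105, 0.113, 0.319, -0.033])   # 5 GHz design from the text
d = 0.0125                                               # deviation in mm
X_train = star_designs(x_opt, d)                         # 11 model evaluations
R_train = np.array([response(x) for x in X_train])
coef = fit_reduced_quadratic(X_train, R_train, x_opt, d)

rng = np.random.default_rng(0)
samples = x_opt + rng.uniform(-d, d, size=(200, x_opt.size))   # 200 random designs
family = evaluate_reduced_quadratic(coef, samples, x_opt, d)   # cheap predictions
```

Each row of `family` plays the role of one thin line in a plot such as Fig. 3.32; no further simulations are needed once the model has been set up.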
Fig. 3.30 Transition through coupling: fine model responses at the initial (dashed line) and final
design (solid line) for (a) fc = 5 GHz, and (b) fc = 10 GHz; -20 dB bandwidth at the final
design marked with horizontal line.
Fig. 3.31 Transition through coupling: transmission response at the initial (dashed line) and
the final design (solid line) for (a) fc = 5 GHz and (b) fc = 10 GHz; -20 dB bandwidth at the
final design marked with horizontal line.
Table 3.8 CPW-microstrip transition: design cost

Center      Design Procedure        Number of            Evaluation Time
frequency   Component               Model Evaluations    Absolute [min]   Relative to Rf

5 GHz       Optimization of Rc.1    62                   62               3.6
            Optimization of Rc.2    27                   45               2.6
            Setup of model q        11 (Rc.2)            18               1.1
            Evaluation of Rf        2                    34               2.0
            Total design time       N/A                  159              9.3

10 GHz      Optimization of Rc.1    54                   64               2.5
            Optimization of Rc.2    26                   46               1.8
            Setup of model q        11 (Rc.2)            19               0.7
            Evaluation of Rf        2                    52               2.0
            Total design time       N/A                  181              7.0
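
The entries of Table 3.8 follow from the evaluation counts and the per-evaluation times quoted in the text. The short check below recomputes the totals; the dictionary layout and the function name are illustrative only.

```python
# Per-evaluation times in minutes for (Rc.1, Rc.2, Rf), taken from the text.
EVAL_TIME = {"5 GHz": (60 / 60, 100 / 60, 17.0), "10 GHz": (71 / 60, 106 / 60, 26.0)}
# Evaluation counts from Table 3.8: Rc.1 optimization, Rc.2 optimization,
# setup of model q (Rc.2 evaluations), and fine-model evaluations.
COUNTS = {"5 GHz": (62, 27, 11, 2), "10 GHz": (54, 26, 11, 2)}

def design_cost(band):
    """Return (total time in minutes, cost in equivalent fine-model evaluations)."""
    t_c1, t_c2, t_f = EVAL_TIME[band]
    n_c1, n_c2, n_q, n_f = COUNTS[band]
    total = n_c1 * t_c1 + (n_c2 + n_q) * t_c2 + n_f * t_f
    return total, total / t_f

for band in COUNTS:
    total, relative = design_cost(band)
    print(f"{band}: {total:.0f} min, {relative:.1f} x Rf")

# Direct pattern search on the fine model needed 124 evaluations at 10 GHz.
speedup = 124 / design_cost("10 GHz")[1]
```

Small differences with respect to the tabulated relative costs can appear because the table sums rounded row entries.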
Fig. 3.32 Transition through coupling: sensitivity analysis using 200 random samples
allocated in the neighbourhood of the optimized designs: (a) fc = 5 GHz and (b) fc = 10 GHz;
-20 dB bandwidth at the final design marked with horizontal line. The sensitivity analysis
setup is described in the text. Thick solid lines denote transition responses at the optimized
designs. Thin lines represent the family of responses corresponding to random samples as
described above.
3.5.2 Design of Coplanar Waveguide-to-Substrate Integrated
Waveguide Transition

Substrate integrated circuits (SICs), and substrate integrated waveguides (SIWs) in
particular, find application in modern microwave and millimeter-wave engineering
because they allow low-cost realization of waveguide components as well as
integration of different components, all within planar technology [43]. One
of the major tasks in SIC transition design is the adjustment of geometry parameters
so that given design specifications are satisfied. For research in computer-aided
design (CAD), this translates into the problem of developing straightforward and
reliable procedures to tune geometries of SIC transitions for required performance
in a given environment.

Increasing the usable bandwidth of conductor-backed coplanar waveguide
(CBCPW)-to-SIW transitions is targeted here. Metalized vias partially protruding
into the substrate in the transition region are used as tuning elements, and
surrogate-based optimization [7] is applied as a design tool. Adjustable metal screws
and pins are classical tuning elements in hollow waveguides and cavities [44];
however, they are not used in SICs in a similar way, since post-manufacturing
adjustment of SICs is hardly possible, and finding the optimal position, diameter, and
protruding depth of vias is a challenging task in the case of SICs. Design of
SIC transitions can be conducted successfully by means of surrogate-based optimization
[7], [45] and coarse-discretization electromagnetic models [46], with the transition
dimensions considered as design optimization variables.

Examples include: (i) design optimization of a transition interfacing the conductor-backed
coplanar waveguide (CBCPW) to SIW without vias (which cannot satisfy the
design specifications), and (ii) re-optimized transitions with metalized vias protruding
into the substrate in the transition region, which improves the usable bandwidth.

Geometry under design. Consider the planar transition interfacing a conductor-backed
coplanar waveguide (CBCPW) to a SIW shown in Fig. 3.33. The CBCPW,
SIW, and transition are on a 3.175 mm thick RT5880 substrate. The CBCPW upper and
lower grounds and the SIW top and bottom walls are of infinite lateral extent. All metal
parts have the conductivity of copper (5.8e7 S/m). Metallization of the CBCPW signal
trace, CBCPW upper ground, and SIW top wall is 1.5 oz copper (~0.05 mm).
Design specifications are |S11|, |S22| ≤ -20 dB for the X-band (here 8.2 GHz to
11.7 GHz).

The dimensions of the input CBCPW are: signal trace width w0 = 2.25 mm; slot
width s0 = 0.2 mm; spacing between the rows of vias M0 = 6.95 mm; spacing
between vias in the row v0 = 2 mm; via diameter d = 1 mm. The dimensions of the
input SIW are: spacing between the rows of vias M2 = 15.95 mm; spacing between
vias in the terminating rows v1 = 1.5 mm; spacing between vias in the row v2 = 2
mm; via diameter is the same as in the CBCPW (1 mm). The cutoff of the SIW's
quasi-TE10 dominant mode is at 6.55 GHz.
Fig. 3.33 CBCPW-to-SIW transition with two vias added: top and side views. The via walls
are not shown on the side view. The dash-dot line shows the location of the symmetry plane 
(magnetic wall). 



The transition comprises a CBCPW section, the probe connecting the CBCPW
signal trace to the SIW bottom wall, and pairs of protruding vias. The length of the
transition, i.e., l1 + l2 in the case without protruding vias, and l1 + y1 in the case of
two protruding vias, is constrained to 20 mm (~1.25 of M2). Figure 3.33 gives a
conceptual view of the transition with two extra vias. The via location, x1, y1, radius,
r1, and protruding depth, h1, are additional (to the dimensions of the CBCPW
section) design variables. In the CAD models, the port-to-port distance is 55 mm,
of which 25 mm is for the SIW section. The SIW is excited through a 2.5 mm section
of the equivalent rectangular waveguide [46]. The 50 ohm CBCPW waveguide
port has a perfect metal periphery connecting the upper and lower CBCPW
grounds. The extra vias protrude into the dielectric from the SIW bottom wall but
they are not allowed to touch the SIW top wall.

Design process. The first step of the optimization process is to optimize the
coarse-discretization EM model of the transition (low-fidelity model) using pattern
search [42]. The design is further improved using the adaptively adjusted design
specifications technique [46, 47], which consists of the following two steps:
(i) Modify the original design specifications to account for the discrepancy
between the low- and high-fidelity models; (ii) Obtain a new design by optimizing
the low-fidelity model with respect to the modified specifications.

In Step (i), the design specifications are modified so that the level of
satisfying/violating the modified specifications by the low-fidelity model response
corresponds to the satisfaction/violation levels of the original specifications by the
high-fidelity model [46]. The low-fidelity model is then optimized in Step (ii) with
respect to the modified specifications, and the new design obtained this way is
treated as an approximate solution to the original design problem (i.e., optimization
of the high-fidelity model with respect to the original specifications). Steps (i) and
(ii) can be repeated if necessary. Typically, a substantial design improvement is
observed after the first iteration. Additional iterations may bring further enhancement,
as the discrepancy between the high- and low-fidelity models may vary
from one design to another. Figure 3.34 illustrates an iteration of this technique used
for design of a CBCPW-to-SIW transition.
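
One iteration of Steps (i) and (ii) can be sketched on a toy problem as follows. The sketch is not the procedure of [46, 47]: it assumes that the model discrepancy can be summarized by a single frequency shift and a single level offset, both estimated by least-squares alignment of the two responses, and the two "models" are hypothetical analytical notch responses with one scalar design variable.

```python
import numpy as np

FREQ = np.linspace(8.0, 12.0, 401)                 # GHz
SPEC = {"band": (9.8, 10.2), "level": -20.0}       # response <= -20 dB in the band

def notch(center, depth=30.0, width=0.4):
    return -depth * np.exp(-((FREQ - center) / width) ** 2)

def high_fidelity(x):      # hypothetical "accurate" model
    return notch(x)

def low_fidelity(x):       # hypothetical coarse model, shifted in frequency
    return notch(x + 0.3)

def violation(resp, spec):
    """Worst-case specification error in dB; negative means the spec is met."""
    lo, hi = spec["band"]
    mask = (FREQ >= lo - 1e-9) & (FREQ <= hi + 1e-9)
    return float(np.max(resp[mask]) - spec["level"])

def modify_spec(spec, r_hi, r_lo):
    """Step (i): move the spec by the frequency/level misalignment of the models."""
    shifts = np.linspace(-1.0, 1.0, 201)
    # np.interp(FREQ - s, FREQ, r_hi) is the high-fidelity curve moved up by s GHz.
    errors = [np.sum((r_lo - np.interp(FREQ - s, FREQ, r_hi)) ** 2) for s in shifts]
    df = float(shifts[int(np.argmin(errors))])
    dl = float(np.mean(r_lo - np.interp(FREQ - df, FREQ, r_hi)))
    lo, hi = spec["band"]
    return {"band": (lo + df, hi + df), "level": spec["level"] + dl}

def optimize_low_fidelity(spec):
    """Step (ii): exhaustive search over the scalar design variable."""
    candidates = np.linspace(9.0, 11.0, 201)
    costs = [violation(low_fidelity(x), spec) for x in candidates]
    return float(candidates[int(np.argmin(costs))])

x0 = optimize_low_fidelity(SPEC)                            # low-fidelity optimum
spec_mod = modify_spec(SPEC, high_fidelity(x0), low_fidelity(x0))
x1 = optimize_low_fidelity(spec_mod)                        # design after one iteration
```

In this toy case the low-fidelity optimum `x0` violates the original specification when checked with the high-fidelity model, whereas `x1` satisfies it; only one high-fidelity evaluation (at `x0`) is needed to modify the specification.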

It should be noted that employing simulation-driven design based on low-fidelity
models allows us to find optimal designs that might not be obtainable otherwise.
Fig. 3.34 Adaptively adjusted design specification technique applied to optimize CBCPW-to-SIW
transitions. High- and low-fidelity model responses are denoted by solid and dashed lines,
respectively. |S22| is distinguished from |S11| using circles. Design specifications are denoted by
thick horizontal lines. (a) High- and low-fidelity model responses at the beginning of the
iteration as well as original design specifications; (b) high- and low-fidelity model responses
and modified design specifications that reflect the differences between the responses;
(c) low-fidelity model optimized to meet the modified specifications; (d) high-fidelity model
at the low-fidelity model optimum shown versus original specifications.



Results. For the case without extra vias the design variables are x = [l1 l2 l3 w1 s1
s2 r0]^T. The optimization procedure starts from x^(0) = [1.0 7.0 0.75 2.0 0.4 1.0
0.75]^T. The responses of the initial design are shown in Fig. 3.35(a) and Fig.
3.36(a). The optimal design is found to be x* = [0.695 7.451 0.323 2.387 0.764
0.235 0.250]^T; its responses are shown in Fig. 3.35(a) and Fig. 3.36(a). The optimal
design has improved reflection and transmission responses compared to the initial
one; however, it does not meet the design requirements. Therefore, additional
vias are introduced (see Fig. 3.33) as tuning elements. The optimum design for the
case of two protruding vias is shown in Table 3.9. The response corresponding to
this design is shown in Fig. 3.35(b) and Fig. 3.36(b).






Fig. 3.35 CBCPW-to-SIW transitions, |S11| (solid) and |S22| (dash) in dB versus frequency [GHz]:
(a) the initial (thin) and optimized (thick) design without protruding vias; and (b) optimized
design with two vias.
solid) design without protruding vias; (b) optimized design with two vias.
Table 3.9 CPW-SIW transitions: final designs

Parameter [mm]       Design 1 (no extra vias)   Design 2 (two extra vias)

l1                   0.695                      0.570
l2                   7.451                      7.389
l3                   0.323                      0.323
w1                   2.387                      2.387
s1                   0.764                      0.839
s2                   0.235                      0.235
r0                   0.250                      0.250
r1                   -                          0.313
x1                   -                          1.750
y1                   -                          11.375
h1                   -                          1.355

|S11|, |S22| [dB]    ≤ -14.6                    ≤ -20
Bandwidth [GHz]      8.0-11.75                  7.78-11.72



3.6 Conclusion 

In this chapter, several techniques for computationally efficient simulation-driven 
design optimization of microwave structures have been discussed. We also pre- 
sented a number of design examples concerning various microwave components, 
including microstrip filters, planar antennas, as well as transition structures. In all 
cases, the surrogate-based techniques presented in the previous chapter have been 
employed as optimization engines. The results presented here indicate that the sur- 
rogate-based optimization methods make the simulation-driven microwave design 
feasible and efficient, both in terms of the quality of the final design, and in terms 
of the computational cost. In most cases, the design cost corresponds to a few 
high-fidelity electromagnetic simulations of the microwave structure under con- 
sideration, typically comparable to the number of design variables. While this kind 
of performance is definitely appealing, improved robustness and reliability as well
as availability through commercial software packages are needed to make the
surrogate-based techniques widely accepted by the microwave engineering community.
Therefore, a substantial research effort in this area is expected in the years to
come.
Chapter 4

Airfoil Shape Optimization Using Variable- 
Fidelity Modeling and Shape-Preserving 
Response Prediction 



Slawomir Koziel and Leifur Leifsson 



Abstract. Shape optimization of airfoils is of primary importance in the design of
aircraft and turbomachinery, with computational fluid dynamics (CFD) being the major
design tool. However, as CFD simulation of the fluid flow past airfoils is computationally
expensive, and numerical optimization often requires a large number of
simulations with several design variables, direct optimization may not be practical. 
This chapter describes a computationally efficient and robust methodology for 
airfoil design. The presented approach replaces the direct optimization of an accu- 
rate but computationally expensive high-fidelity airfoil model by an iterative 
re-optimization of a corrected low-fidelity model. The shape-preserving response 
prediction technique is utilized to correct the low-fidelity model by aligning the 
pressure and skin friction distributions of the low-fidelity model with the corre- 
sponding distributions of the high-fidelity model. The algorithm requires one evalua- 
tion of the high-fidelity CFD model per design iteration. The algorithm is applied to 
several example case studies at both transonic and high-lift flow conditions. 



4.1 Introduction 

The use of optimization methods in the design process, as a design support tool or
for design automation, has now become commonplace. In aircraft design, the development
of numerical optimization techniques started in the mid-1970s when
Hicks and Henne [1] used gradient-based optimization methods coupled with
computational fluid dynamics (CFD) codes to design airfoils and wings at both
subsonic and transonic conditions. Substantial progress in gradient-based methods
for aerodynamic design has been made since then. Jameson [2] introduced control
theory and continuous adjoint methods to the optimal aerodynamic design of
two-dimensional airfoils and three-dimensional wings, initially using inviscid flow
solvers [3, 4], and later using viscous flow solvers [5, 6].

The use of higher fidelity methods, coupled with optimization techniques, has 
led to improved design efficiency. However, simulation-driven aerodynamic de- 
sign optimization involves numerous challenges. In particular, the high-fidelity 
CFD simulations are computationally expensive (e.g., three-dimensional simula- 
tions of turbulent flows can take many days on a parallel computer), the design 
optimization normally requires a large number of simulations, and a large number 
of design variables are often involved. Therefore, direct optimization of the high- 
fidelity CFD model may be impractical, especially when using traditional gradi- 
ent-based techniques. 

Computationally feasible design exploiting CFD simulations can be realized using
surrogate-based optimization (SBO) techniques [7, 8]. One of the objectives of
SBO is to reduce the number of evaluations of the high-fidelity models, thereby
making the optimization process more efficient. This is achieved by using
computationally cheap surrogate functions in lieu of the CPU-intensive high-fidelity
models. The surrogate models can be created either by approximating the sampled
high-fidelity model data using regression (so-called function-approximation
surrogates), or by correcting physics-based low-fidelity models, which are less accurate
but computationally cheap representations of the high-fidelity models.

A variety of techniques are available to create the function-approximation surrogate
model, such as polynomial regression [7] and kriging [9]. Function-approximation
models are versatile; however, they normally require a substantial
amount of data samples to ensure good accuracy. The physics-based surrogates are
constructed by correcting the underlying low-fidelity models, which can be obtained
through simplified physics models [10], coarse-discretization CFD simulation
[11], or relaxed convergence criteria [12]. Popular correction methods include
response correction [13] and space mapping [14].

The physics-based surrogate models are typically more expensive to evaluate
than the function-approximation surrogates, but less high-fidelity model data is
needed to obtain a given accuracy level. SBO algorithms that utilize physics-based
low-fidelity models (so-called variable- or multi-fidelity SBO) typically require
only a single high-fidelity model evaluation per algorithm iteration.
Due to this, the variable-fidelity SBO method is more scalable to larger numbers of
design variables (assuming that no derivative information is required). A review of
SBO methods popular in aerospace design can be found in [7] and [8].

In this chapter we describe a computationally efficient variable-fidelity airfoil 
shape optimization methodology [15, 16, 17], which employs physics-based low- 
fidelity surrogate models created by means of the shape-preserving response predic- 
tion (SPRP) technique [18]. Section 4.2 describes briefly the problem formulation 
for airfoil shape optimization. The optimization methodology is described in detail 
in Section 4.3. Application of the method to transonic and high-lift airfoil design is 
given in Sections 4.4 and 4.5, respectively. Section 4.6 summarizes the chapter. 




4.2 Airfoil Shape Optimization 

Out of a variety of design problems in aerodynamics, we focus here on airfoil 
shape optimization (ASO). Material concerning general aerodynamic optimiza- 
tion, CFD analysis, shape parameterization, and other relevant issues can be found 
in [26], Chapter 9. 

An airfoil is a streamlined aerodynamic surface such as the one shown in
Fig. 4.1. The function of the airfoil is to generate a lift force $l$ at a range of operating
conditions (Mach number $M_\infty$, Reynolds number $Re$, angle of attack $\alpha$). The
drag force increases quadratically with increasing lift. Normally, the drag force is
to be minimized for a given lift. These forces are non-dimensionalized by dividing
them by $q_\infty S$, where $q_\infty = (1/2)\rho_\infty V_\infty^2$ is the dynamic pressure, $\rho_\infty$ is the air
density, $V_\infty$ is the free-stream velocity, and $S$ is a reference surface. After
non-dimensionalization they are called the lift coefficient, denoted by $C_l$, and the drag
coefficient, denoted by $C_d$.
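
As a small worked example of this non-dimensionalization, the snippet below computes the two coefficients from given forces. The function names and the numerical values are illustrative only and are not taken from the case studies; SI units are assumed.

```python
def dynamic_pressure(rho_inf, v_inf):
    """q_inf = 0.5 * rho_inf * V_inf**2, in Pa for SI inputs."""
    return 0.5 * rho_inf * v_inf ** 2

def force_coefficient(force, rho_inf, v_inf, s_ref):
    """Non-dimensional coefficient: force / (q_inf * S)."""
    return force / (dynamic_pressure(rho_inf, v_inf) * s_ref)

# Illustrative numbers (assumed): air density, free-stream speed, reference surface.
rho_inf, v_inf, s_ref = 1.225, 50.0, 1.0       # kg/m^3, m/s, m^2
lift, drag = 1225.0, 61.25                     # N
c_l = force_coefficient(lift, rho_inf, v_inf, s_ref)
c_d = force_coefficient(drag, rho_inf, v_inf, s_ref)
lift_to_drag = c_l / c_d
```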

In direct ASO, the objective is to determine an airfoil shape that maximizes a 
performance criterion for a given set of constraints at a particular operating condi- 
tion. Usually, the lift coefficient is maximized, the drag coefficient is minimized, 
or the lift-to-drag ratio is maximized. For example, if the lift coefficient is maxi- 
mized, then a constraint is necessary on the maximum allowable drag coefficient. 
Further constraints are often included, e.g., to account for the wing structural 
components inside the airfoil one sets a constraint on the airfoil cross-sectional 
area. In inverse ASO, the airfoil shape is designed to attain a specific flow behav- 
ior which is defined a priori. Typically, a target airfoil surface pressure distribu- 
tion is prescribed. 

The optimization methodology described in this chapter is illustrated using the 
direct ASO approach. However, the method itself is more general and can also be 
applied to inverse airfoil design. 



Fig. 4.1 A single-element airfoil section of chord length $c$ and thickness $t$. $V_\infty$ is the
free-stream velocity at an angle of attack $\alpha$ relative to the x-axis. $l$ is the lift force
(perpendicular to $V_\infty$) and $d$ is the drag force (parallel to $V_\infty$).
4.3 Optimization Methodology

In this section, we formulate the variable-fidelity airfoil optimization methodology 
that exploits the CFD-based low-fidelity models and the shape-preserving re- 
sponse prediction [18] methodology as the model correction tool. The NACA 
four-digit airfoil parameterization method is used due to its simplicity. The details 
of this parameterization, as well as the details of the CFD modeling methodology 
using the grid generator ICEM CFD [19] and the flow solver FLUENT [20] can be 
found in [26], Chapter 9. 



4.3.1 General Description 

The method follows the general principles of SBO [7, 8], as shown in Fig. 4.2, 
where the optimization burden is shifted to the low-cost surrogate model (referred 
to as s), whereas the high-fidelity model (referred to as f) is referenced occasion- 
ally for verification purposes and to obtain data necessary to update the surrogate. 
The surrogate is a corrected physics-based low-fidelity model (referred to as c). 
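
The general SBO loop can be sketched as follows. For brevity the sketch uses a plain additive response correction rather than SPRP, and the models f and c are hypothetical analytical stand-ins that return a sampled response; the objective (a least-squares match to a target response) and all names are assumptions of this example.

```python
import numpy as np

Y = np.linspace(0.0, 1.0, 21)                  # sample points of the response
BASIS = np.column_stack([Y, Y ** 2])
TARGET = 2.0 * Y + 3.0 * Y ** 2 + 0.05         # response the design should reproduce

def f(x):
    """Hypothetical high-fidelity model (treated as expensive)."""
    return 1.1 * x[0] * Y + x[1] * Y ** 2 + 0.05

def c(x):
    """Hypothetical low-fidelity model (cheap, systematically off)."""
    return x[0] * Y + x[1] * Y ** 2

def optimize_surrogate(correction):
    """argmin_x ||c(x) + correction - TARGET||^2; linear in x for this toy c."""
    x, *_ = np.linalg.lstsq(BASIS, TARGET - correction, rcond=None)
    return x

def sbo(x0, max_iter=20, tol=1e-8):
    x = np.asarray(x0, dtype=float)
    n_f = 0
    for _ in range(max_iter):
        correction = f(x) - c(x)                 # the only f evaluation this iteration
        n_f += 1
        x_new = optimize_surrogate(correction)   # optimization burden is on s
        if np.linalg.norm(x_new - x) < tol:
            return x_new, n_f
        x = x_new
    return x, n_f

x_opt, n_f = sbo([1.0, 1.0])
```

The surrogate s(x) = c(x) + [f(x_i) - c(x_i)] agrees with f at the current design, so each iteration costs one evaluation of f while all optimization effort is spent on the cheap model.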

The low-fidelity model is corrected to become a reliable representation of the
high-fidelity model. Normally, the figures of interest in the optimization, i.e., the
objectives and constraints, are aligned between the high-fidelity and low-fidelity
models using a correction procedure, e.g., space mapping [14]. However, in the
case of aerodynamic shape optimization, the figures of interest, such as the lift and
drag coefficients, are scalars for a given operating condition and a given design
vector $x$. Any alignment procedure applied to match the low-fidelity model with the
high-fidelity one is therefore non-unique, unless a sufficiently large amount of
high-fidelity data is used in the model matching process.

Here, the model alignment is performed using intermediate simulation results,
more specifically, the pressure and skin friction distributions, whose dimensionality
can be made as large as necessary by selecting a sufficient number of control
points along the airfoil chord. As the objectives and constraints are uniquely
determined by the pressure and skin friction distributions, alignment of the
corresponding distributions for the low- and high-fidelity models will result in a
(unique) alignment of the figures of interest. The SPRP methodology [18] is
adopted here for the alignment procedure.

4.3.2 Surrogate Modeling Using Shape-Preserving Response 
Prediction 

The SPRP model is formulated here using the pressure distribution. The formulation
for the skin friction part is analogous. We denote the pressure distributions for
the high- and low-fidelity models as $C_{p.f}$ and $C_{p.c}$, respectively. The surrogate
model is constructed assuming that the change of $C_{p.f}$ due to the adjustment of the
design variables $x$ can be predicted using the actual changes of $C_{p.c}$. The change of
$C_{p.c}$ is described by the translation vectors corresponding to a certain (finite) number
of its characteristic points on the pressure distribution. These translation vectors
are subsequently used to predict the change of $C_{p.f}$, whereas the actual $C_{p.f}$ at the
current design, $C_{p.f}(x^{(i)})$, is treated as a reference.

Figure 4.3(a) shows the pressure distribution $C_{p.c}$ of the low-fidelity model at
$x^{(i)} = [0.02\ 0.4\ 0.12]^T$ (NACA 2412 airfoil) for $M_\infty = 0.7$ and $\alpha = 1$ deg, as well as
$C_{p.c}$ at $x = [0.025\ 0.56\ 0.122]^T$; $x^{(i)}$ denotes the current design (at the ith iteration
of the optimization algorithm; the initial design is denoted as $x^{(0)}$ accordingly).
Circles denote characteristic points of $C_{p.c}(x^{(i)})$, here representing, among
others, x/c equal to 0 and 1 (leading and trailing airfoil edges, respectively), the
maxima of $C_{p.c}$ for the lower and upper airfoil surfaces, as well as the local minimum
of $C_{p.c}$ for the upper surface. The last two points are useful to locate the pressure
shock. Squares denote the corresponding characteristic points for $C_{p.c}(x)$, while
small line segments represent the translation vectors that determine the "shift" of
the characteristic points of $C_{p.c}$ when changing the design variables from $x^{(i)}$ to $x$.

In order to obtain a reliable prediction, the number of characteristic points has 
to be larger than illustrated in Fig. 4.3(a). Additional points are inserted in be- 
tween initial points either uniformly with respect to x/c (for those parts of the pres- 
sure distribution that are almost flat) or based on the relative pressure value with 
respect to corresponding initial points (for those parts of the pressure distribution 
that are "steep"). Figure 4.3(b) shows the full set of characteristic points (initial 
points are distinguished using larger markers). 
Fig. 4.3 (a) Example low-fidelity model pressure distribution at the design $x^{(i)}$, $C_{p.c}(x^{(i)})$
(solid line), the low-fidelity model pressure distribution at another design $x$, $C_{p.c}(x)$ (dotted
line), characteristic points of $C_{p.c}(x^{(i)})$ (circles) and $C_{p.c}(x)$ (squares), and the translation
vectors (short lines); (b) low-fidelity model pressure distributions, initial characteristic points
(large markers) and translation vectors from Fig. 4.3(a) as well as additional points (small
markers) inserted in between the initial points either uniformly with respect to x/c (for the
"flat" parts of the pressure distribution) or based on the relative pressure value with respect
to corresponding initial points (for the "steep" parts of the pressure distribution)



The pressure distribution of the high-fidelity model at the given design, here $x$,
can be predicted using the translation vectors applied to the corresponding
characteristic points of the pressure distribution of the high-fidelity model at $x^{(i)}$, $C_{p.f}(x^{(i)})$.
This is illustrated in Fig. 4.4(a), where only initial characteristic points and translation
vectors are shown for clarity. Figure 4.4(b) shows the predicted pressure distribution
of the high-fidelity model at $x$ as well as the actual $C_{p.f}(x)$. The agreement
between both curves is very good.
Fig. 4.4 (a) High-fidelity model pressure distribution at $x^{(i)}$, $C_{p.f}(x^{(i)})$ (solid line) and the
predicted high-fidelity model $C_p$ at $x$ (dotted line) obtained using SPRP based on characteristic
points of Fig. 4.3(b); characteristic points of $C_{p.f}(x^{(i)})$ (circles) and the translation vectors
(short lines) were used to find the characteristic points (squares) of the predicted
high-fidelity model pressure distribution (only initial points are shown for clarity); low-fidelity
model distributions $C_{p.c}(x^{(i)})$ and $C_{p.c}(x)$ are plotted using thin solid and dotted lines,
respectively; (b) high-fidelity model pressure distribution at $x$, $C_{p.f}(x)$ (solid line), and the
predicted high-fidelity model pressure distribution at $x$ obtained using SPRP (dotted line)



SPRP can be rigorously formulated as follows. Let $C_{p.f}(x) = [c_{p.f}(x,y_1)\ \ldots\ c_{p.f}(x,y_m)]^T$
and $C_{p.c}(x) = [c_{p.c}(x,y_1)\ \ldots\ c_{p.c}(x,y_m)]^T$, where $y_j$, $j = 1, \ldots, m$, are control
points on the x/c axis (we assume that $y_{j+1} > y_j$ and $0 \le y_j \le 1$ for all $j$). To simplify
the notation we assume that $C_{p.f}$ ($C_{p.c}$) is the pressure distribution for the upper surface
only. Formulation for the lower surface is identical. Let $p_j^f = [y_j^f\ r_j^f]^T$,
$p_j^{c0} = [y_j^{c0}\ r_j^{c0}]^T$, and $p_j^c = [y_j^c\ r_j^c]^T$, $j = 1, \ldots, K$, denote the sets of
characteristic points of $C_{p.f}(x^{(i)})$, $C_{p.c}(x^{(i)})$ and $C_{p.c}(x)$, respectively. Here, $y$ and $r$
denote the x/c and magnitude components of the respective point. The translation vectors
of the low-fidelity model pressure distribution are defined as $t_j = [y_j^t\ r_j^t]^T$,
$j = 1, \ldots, K$, where $y_j^t = y_j^c - y_j^{c0}$ and $r_j^t = r_j^c - r_j^{c0}$.
The SPRP surrogate model is defined as follows

$C_{p.s}^{(i)}(x) = [c_{p.s}^{(i)}(x,y_1)\ \ldots\ c_{p.s}^{(i)}(x,y_m)]^T$ (4.1)

where

$c_{p.s}^{(i)}(x,y_j) = \bar{c}_{p.f}(x^{(i)}, F(y_j, \{-y_k^t\}_{k=1}^K)) + R(y_j, \{r_k^t\}_{k=1}^K)$ (4.2)

for $j = 1, \ldots, m$. $\bar{c}_{p.f}(x,y)$ is an interpolation of $\{c_{p.f}(x,y_1), \ldots, c_{p.f}(x,y_m)\}$ onto the
interval [0,1]. The scaling function $F$ interpolates the data pairs $\{y_1,y_1\}$, $\{y_1^f, y_1^f - y_1^t\}$,
..., $\{y_K^f, y_K^f - y_K^t\}$, $\{y_m,y_m\}$ onto the interval [0,1]. The function $R$ does a similar
interpolation for data pairs $\{y_1,r_1\}$, $\{y_1^f, r_1^f - r_1^t\}$, ..., $\{y_K^f, r_K^f - r_K^t\}$, $\{y_m,r_m\}$;
here $r_1 = c_{p.c}(x,y_1) - c_{p.c}(x^{(i)},y_1)$ and $r_m = c_{p.c}(x,y_m) - c_{p.c}(x^{(i)},y_m)$. Note that
$C_{p.s}^{(i)}(x^{(i)}) = C_{p.f}(x^{(i)})$ as all translation vectors are zero at $x = x^{(i)}$.

The prediction method assumes that the high- and low-fidelity model pressure 
distributions have corresponding sets of characteristic points. This is usually the 
case for the practical ranges of design variables because the overall shape of the 
distributions is similar for both models. In case of a lack of correspondence, origi- 
nal definitions of characteristic points are replaced by their closest counterparts. 
The typical example would be non-existence of the local minimum of the pressure 
distribution for the upper surface for the high- and/or low-fidelity model at certain 
designs. In this case, the original point (local minimum) is replaced by the points 
characterized by the largest curvature. 
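
The translation-vector bookkeeping of SPRP can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the characteristic points are taken as already extracted and paired one-to-one, the function names are hypothetical, and the sample coordinates are made up rather than taken from the chapter. The interpolation step of Eq. (4.2) is not included.

```python
import numpy as np

def translation_vectors(pc_ref, pc_new):
    """Translation vectors of the low-fidelity characteristic points.

    pc_ref: characteristic points of the low-fidelity model at the reference design
    pc_new: characteristic points of the low-fidelity model at the new design
    Both have shape (K, 2): column 0 is x/c, column 1 is the magnitude.
    """
    pc_ref = np.asarray(pc_ref, dtype=float)
    pc_new = np.asarray(pc_new, dtype=float)
    if pc_ref.shape != pc_new.shape:
        raise ValueError("characteristic point sets must correspond one-to-one")
    return pc_new - pc_ref

def predict_characteristic_points(pf_ref, t):
    """Shift the high-fidelity characteristic points by the translation vectors."""
    return np.asarray(pf_ref, dtype=float) + t

# Hypothetical data (not from the chapter): three paired characteristic points.
pc_ref = np.array([[0.05, -0.90], [0.45, -1.10], [0.60, -0.20]])  # low-fidelity, reference design
pc_new = np.array([[0.05, -0.85], [0.50, -1.00], [0.66, -0.25]])  # low-fidelity, new design
pf_ref = np.array([[0.04, -0.95], [0.48, -1.20], [0.55, -0.30]])  # high-fidelity, reference design

t = translation_vectors(pc_ref, pc_new)
pf_pred = predict_characteristic_points(pf_ref, t)
```

When the new design coincides with the reference design, all translation vectors vanish and the predicted points reduce to the high-fidelity ones, which mirrors the zero-translation property noted in the formulation.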

4.3.3 Objective Function 

Due to unavoidable misalignment between the pressure distributions of the high-fidelity
model and its SPRP surrogate at designs other than the one at which the model is
determined, i.e., $x^{(i)}$, it is not convenient to handle constraints (e.g., drag) directly,
because a design that is feasible for the surrogate model may not be feasible for the
high-fidelity model. In particular, the design obtained as a result of optimizing the
surrogate model $C_{p.s}^{(i)}$, i.e., $x^{(i+1)}$, will be feasible for $C_{p.s}^{(i)}$. However, if $x^{(i+1)}$ is
not feasible for the high-fidelity model, it will not be feasible for $C_{p.s}^{(i+1)}$ because we
have $C_{p.s}^{(i+1)}(x^{(i+1)}) = C_{p.f}(x^{(i+1)})$ by the definition of the surrogate model. In order to
alleviate this problem, we shall use the penalty function approach to handle the constraints.

More specifically, if the figure of interest is the lift coefficient, while the drag and
the airfoil cross-sectional area are constraints, the objective function is defined as

$H(C_p(x)) = -C_{l.s}(C_p(x)) + \beta[\Delta C_{d.s}(C_p(x))]^2 + \gamma[\Delta A(x)]^2$ (4.3)

where $\Delta C_{d.s} = 0$ if $C_{d.s} \le C_{d.s.max}$ and $\Delta C_{d.s} = C_{d.s} - C_{d.s.max}$ otherwise, and
$\Delta A = 0$ if $A \ge A_{min}$ and $\Delta A = A - A_{min}$ otherwise. In the numerical experiments
presented in the next section, we use $\beta = \gamma = 1000$. Here the pressure distribution for the
surrogate model is $C_p = C_{p.s}$, and for the high-fidelity model $C_p = C_{p.f}$. Also, $C_{l.s}$
and $C_{d.s}$ denote the lift and drag coefficients (both being functions of the pressure
distribution).
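
A minimal Python sketch of the penalty objective of Eq. (4.3) is given below. It assumes that the lift coefficient, drag coefficient and cross-sectional area have already been computed from the pressure distribution and are passed in as plain numbers; the function name and argument names are hypothetical, and the 5% constraint tolerance band is not modelled. The sample values are the Initial and VF-SPRP columns of Table 4.1.

```python
def penalty_objective(cl, cd, area, cd_max, area_min, beta=1000.0, gamma=1000.0):
    """Penalized objective in the form of Eq. (4.3): -C_l + beta*dCd^2 + gamma*dA^2."""
    d_cd = max(cd - cd_max, 0.0)        # zero while the drag constraint is satisfied
    d_area = min(area - area_min, 0.0)  # zero while the area constraint is satisfied
    return -cl + beta * d_cd ** 2 + gamma * d_area ** 2

# Initial design of Table 4.1: the drag limit is violated, so the penalty is active.
h_initial = penalty_objective(0.4732, 0.0100, 0.0808, cd_max=0.0040, area_min=0.075)

# VF-SPRP design of Table 4.1: drag exceeds the limit only marginally.
h_vf_sprp = penalty_objective(0.5085, 0.0041, 0.0783, cd_max=0.0040, area_min=0.075)
```

With these inputs the initial design evaluates to approximately -0.4372 and the VF-SPRP design to approximately -0.50849, so the penalized objective ranks the optimized design as better even though its drag sits slightly above the limit.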

4.3.4 Optimization with SPRP Surrogate 

The efficiency of variable-fidelity optimization with the SPRP model is illustrated by
airfoil design at $M_\infty = 0.75$ and $\alpha = 0^\circ$. The initial design is set as NACA 2412
and the objective function is defined by Eq. (4.3), with $C_{d.s.max} = 0.0040$ and $A_{min} = 0.075$.
The side constraints on the design variables are $0 \le m \le 0.03$, $0.3 \le p \le 0.6$,
and $0.09 \le t \le 0.13$. Constraint tolerance bands are set to 5%.

The high-fidelity model is based on the Euler equations and it is solved as 
described in [26], Chapter 9. The low-fidelity model is the transonic small- 
disturbance equation (TSDE) and it is solved using the computer code TSFOIL 
[21], which was developed at NASA in the 1970s. The code is capable of solving 
the TSDE for flow past lifting airfoils in both free air and various wind-tunnel en- 
vironments by using a finite-difference method and an iterative successive line 
over-relaxation (SLOR) algorithm. The computational grid is a simple, fixed Car- 
tesian grid. 

Five iterations of the SPRP-based design methodology were executed. The 
computational cost is 5 high-fidelity and 161 surrogate model evaluations. The 
surrogate model optimization is performed using the pattern-search algorithm [22]. 
The results are given in Table 4.1. As the surrogate model evaluates quite fast 
(about 1 to 3 seconds depending on the design) and the high-fidelity model 
evaluation takes a few minutes, the total cost of evaluating the low-fidelity model 
in the whole optimization run corresponds to roughly 1-2 evaluations of the high- 
fidelity model. The equivalent number of high-fidelity model evaluations is less 
than 7 for this particular case. 

Table 4.1 Numerical results of the design optimization. All the numerical values are from
the high-fidelity model. $N_c$ is the number of low-fidelity model evaluations and $N_f$ is the
number of high-fidelity model evaluations.

| Variable    | Initial | Direct¹ | VF-SPRP² |
|-------------|---------|---------|----------|
| m           | 0.0200  | 0.0160  | 0.0173   |
| p           | 0.4000  | 0.5999  | 0.5930   |
| t           | 0.1200  | 0.1199  | 0.1163   |
| $C_l$       | 0.4732  | 0.4770  | 0.5085   |
| $C_d$       | 0.0100  | 0.0040  | 0.0041   |
| A           | 0.0808  | 0.0808  | 0.0783   |
| $N_c$       | N/A     | 0       | 161      |
| $N_f$       | N/A     | 130     | 5        |
| Total cost³ | N/A     | 130     | < 7      |

¹ Direct optimization of the high-fidelity model using the pattern-search algorithm [22].

² Design obtained using the methodology described here and the pattern-search algorithm [22].

³ The total optimization cost is expressed in the equivalent number of high-fidelity model evaluations.

The optimizer achieves this design by reducing the maximum ordinate of the mean
camber line (m) from 2% to 1.6%, and moving the location of the maximum camber
(p) rearward from 40% to 59% (which is close to the side constraint upper limit).

By reducing the camber, the flow velocity decreases on the upper surface and the
shock strength is reduced. This can be seen in Fig. 4.5. By moving the maximum
camber rearward, the aft camber increases and the pressure distribution opens up
behind the shock, where the flow is subsonic, and lift is increased by 3.5 lift counts (one
lift count is $\Delta C_l = 0.01$). This can be seen in Fig. 4.6. The drag is reduced to satisfy
the constraint limit. This is achieved by reducing the thickness from 12% to 11.6%.

The case was also solved by direct optimization of the high-fidelity model
using the pattern-search algorithm. Direct optimization obtained a very similar
optimal design, but required 130 high-fidelity model evaluations.

Fig. 4.5 (a) Mach number contours for the initial design (NACA 2412) at $M_\infty = 0.75$ and
$\alpha = 0^\circ$, (b) Mach number contours of the optimum design at the same operating condition
as in (a)

It should be emphasized that despite the encouraging results obtained for the
considered test case, the use of the TSFOIL code may be problematic in general.
In particular, TSFOIL does not converge to a valid flow solution for every design
in the solution domain. This may result in convergence problems of the
optimization algorithm. Furthermore, it turns out that the TSDE-based surrogate does
not give a sufficiently reliable prediction of the drag coefficient. In particular, for
small (local) changes of the design variables the low-fidelity model does not
closely follow the high-fidelity model.

The conclusions drawn from this study are: 

• a proper low-fidelity model needs to be selected to provide a reliable pre- 
diction of the high-fidelity model, especially for small changes in the de- 
sign variables, 

• the low-fidelity model has to be reliable in terms of execution, and 

• the optimization algorithm should be endowed with suitable convergence 
safeguards. 

4.3.5 Optimization Algorithm 

In this section, we formulate the optimization algorithm exploiting the SPRP- 
based surrogate model and a trust-region convergence safeguard [23]. This algo- 
rithm is used to solve the transonic and high-lift airfoil design cases, presented in 
Sections 4.4 and 4.5. The algorithm flow can be summarized as follows: 

1. Set $i = 0$; Select $\Delta$ (initial trust region radius); Evaluate $C_{p.f}(x^{(0)})$;

2. Setup SPRP model;

3. Obtain $x^{(i+1)} = \arg\min\{l \le x \le u,\ \|x - x^{(i)}\| \le \Delta : H(C_{p.s}^{(i)}(x))\}$;

4. Evaluate high-fidelity model to get $C_{p.f}(x^{(i+1)})$;

5. If $H(C_{p.f}(x^{(i+1)})) < H(C_{p.f}(x^{(i)}))$ accept $x^{(i+1)}$; Otherwise $x^{(i+1)} = x^{(i)}$;

6. Update $\Delta$;

7. Set $i = i + 1$;

8. If termination condition is not satisfied, go to 2.

The SPRP surrogate model is updated before each iteration of the optimization
algorithm using the high-fidelity model data at the design obtained in the previous
iteration. The trust-region parameter $\Delta$ is updated after each iteration, i.e.,
decreased if the new design was rejected or the improvement of the high-fidelity
model objective function was too small compared to the prediction given by
the SPRP surrogate, or increased otherwise. Classical updating rules are used
(see, e.g., [23, 24]). The algorithm is terminated if $\|x^{(i+1)} - x^{(i)}\| < 0.001$ or
$\Delta < 0.001$.

Fig. 4.6 (a) Pressure distributions of the initial (solid) and optimized (dashed) airfoil
shapes, (b) initial (solid) and optimized (dashed) airfoil shapes
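
The trust-region loop of Section 4.3.5 can be sketched as follows. This is a simplified illustration, not the authors' implementation, and rests on several assumptions: two toy quadratic functions stand in for the high- and low-fidelity objectives, a zero-order additive correction stands in for the SPRP surrogate, the trust region is taken in the infinity norm, the update thresholds (0.25, 0.75) and factors (3, 2) are one common choice of "classical" rules, and the step-size test uses the trial step before acceptance. All function names are hypothetical.

```python
import numpy as np

def pattern_search(fun, x, delta, lower, upper, min_step=1e-4, max_sweeps=500):
    """Coordinate pattern search inside the side constraints and the trust region."""
    lo = np.maximum(lower, x - delta)   # infinity-norm trust region (assumption)
    hi = np.minimum(upper, x + delta)
    best, f_best, step = x.copy(), fun(x), delta
    for _ in range(max_sweeps):
        if step <= min_step:
            break
        improved = False
        for k in range(x.size):
            for sign in (1.0, -1.0):
                trial = best.copy()
                trial[k] = np.clip(trial[k] + sign * step, lo[k], hi[k])
                f_trial = fun(trial)
                if f_trial < f_best:
                    best, f_best, improved = trial, f_trial, True
        if not improved:
            step *= 0.5                 # refine the pattern when no poll point helps
    return best

def trust_region_optimize(high, build_surrogate, x0, lower, upper,
                          delta0=0.2, tol=1e-3, max_iter=50):
    x = np.asarray(x0, dtype=float)
    delta = delta0
    h_x = high(x)                                       # step 1
    n_high = 1
    for _ in range(max_iter):
        surrogate = build_surrogate(x, h_x)             # step 2
        x_trial = pattern_search(surrogate, x, delta, lower, upper)  # step 3
        h_trial = high(x_trial)                         # step 4
        n_high += 1
        predicted = surrogate(x) - surrogate(x_trial)
        rho = (h_x - h_trial) / predicted if predicted > 0 else 0.0
        step_len = np.linalg.norm(x_trial - x)
        if h_trial < h_x:                               # step 5: accept improvement
            x, h_x = x_trial, h_trial
        if rho < 0.25:                                  # step 6: assumed update rule
            delta /= 3.0
        elif rho > 0.75:
            delta *= 2.0
        if step_len < tol or delta < tol:               # step 8: termination
            break
    return x, h_x, n_high

def high(x):   # hypothetical stand-in for the expensive objective
    return (x[0] - 0.6) ** 2 + (x[1] - 0.4) ** 2

def low(x):    # hypothetical cheap objective with a shifted optimum
    return (x[0] - 0.5) ** 2 + (x[1] - 0.5) ** 2

def build_surrogate(x_ref, h_ref):
    shift = h_ref - low(x_ref)          # surrogate matches `high` at x_ref only
    return lambda z: low(z) + shift

lower, upper = np.array([0.0, 0.0]), np.array([1.0, 1.0])
x_opt, h_opt, n_high = trust_region_optimize(high, build_surrogate, [0.1, 0.9], lower, upper)
```

The toy surrogate agrees with the expensive function only in value at the current iterate, so it illustrates the control flow (accept or reject, radius update, termination) rather than the prediction quality that SPRP is designed to provide.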

4.4 Transonic Airfoil Design 

In this section, the variable-fidelity optimization algorithm is applied to airfoil 
design at a steady transonic flow condition. The surrogate model optimization is 
performed using the pattern-search algorithm [22]. The results of the design meth- 
odology are compared to the results obtained through direct optimization of the 
high-fidelity model. 

4.4.1 Case Setup 

The initial design is set as NACA 3210 and the objective function is defined
by Eq. (4.3), with $C_{d.s.max} = 0.0041$ and $A_{min} = 0.065$. The operating condition is
$M_\infty = 0.75$ and $\alpha = 1^\circ$. The side constraints on the design variables are
$0 \le m \le 0.1$, $0.2 \le p \le 0.8$, and $0.05 \le t \le 0.20$. Constraint tolerance bands are set to 5%.



4.4.2 Model Setup 

In this case, variable-resolution modeling is employed. The high- and low-fidelity
models solve the Euler equations, but the low-fidelity model uses a coarser
computational mesh and relaxed convergence criteria. The particulars of the
low-fidelity model mesh and convergence criteria are found by performing a
parametric study on a typical airfoil section.

The NACA 2412 was selected for the parametric study. The Mach number is
taken to be $M_\infty = 0.75$ and the angle of attack is set to $\alpha = 1$ deg. First, a fine mesh
is developed with a total of 320 points in the y-direction, 180 points on the airfoil
surface and 160 points in the wake behind the airfoil, with a total of 106 thousand
cells. Then, the flow is solved to full convergence to get the reference values. The
convergence history is shown in Fig. 4.7(a). The solver needed 216 iterations to
reach a converged solution based on the residuals. However, the lift and drag
coefficients reach converged values after approximately 50 iterations, as can be
seen in Fig. 4.7(b). Therefore, the iteration limit is set to 100 in the subsequent steps.

Fig. 4.7 (a) Convergence history of the simulation of the flow past the NACA 2412 at
$M_\infty = 0.75$ and $\alpha = 1$ deg., (b) convergence of the lift and drag coefficients. The converged
value of the lift coefficient is $C_l = 0.67$ and that of the drag coefficient is $C_d = 0.0261$.

Subsequently, the number of mesh points was reduced. This was done in two
stages. First, the number of mesh points in the y-direction and the number of mesh
points behind the airfoil were halved in each step. Then, the number of mesh
points on the airfoil surface was reduced incrementally. In each step, the pressure
distribution was plotted. This was done so that the overall number of mesh points could
be reduced as much as possible without reducing the mesh density on the airfoil
surface, so that the shock could be resolved adequately.

Fig. 4.8 (a) Pressure distributions for the first part of the parametric mesh study where the
mesh points are reduced in the y-direction and in the wake behind the airfoil, (b) pressure
distributions for the second part where the mesh points are reduced on the airfoil surface.
The Mach number is $M_\infty = 0.75$ and the angle of attack is $\alpha = 1$ deg.

The results of the first mesh reduction are shown in Fig. 4.8(a). In the first four
steps the number of cells is reduced from 106 thousand to 8295, but the pressure
distribution does not change significantly, aside from the region of the shock, where
the shock has strengthened and moved aft by less than 2.5% of the chord length.

This has led, however, to a significant increase in the estimation of the drag
coefficient (+23.7%), as can be seen in Fig. 4.9(a), and a moderate increase in the
lift coefficient (+2.7%). The evaluation time has been reduced from 470 s to 40 s
(Fig. 4.9(b)). In the last step the number of mesh points in the y-direction is
reduced to only 12 and the total number of cells is 3750. Now, there is a large
change in the shock strength and location, and the pressure distribution is also
altered in the front part of the airfoil, leading to a large increase in the drag
coefficient and a reduction in the lift coefficient.

Fig. 4.9 (a) Lift and drag coefficients as a function of the number of cells for the parametric
mesh study, (b) evaluation time of the CFD simulation model as a function of the number
of cells. The Mach number is $M_\infty = 0.75$ and the angle of attack is $\alpha = 1$ deg.

The fourth mesh was selected for the second mesh reduction. The results are
shown in Fig. 4.8(b). The number of mesh points on the airfoil surface was
reduced by 50 in the first two steps (meshes 6 and 7) and then by 20 (mesh 8). It is
clear that, as the mesh gets coarser on the airfoil surface, the shock is smeared over
a larger area and the estimated shock strength is reduced. As can be seen from
Fig. 4.9(a), both the drag and lift coefficients are reduced in this process. The
overall evaluation time is reduced to about 34 s in the last step (Fig. 4.9(b)).

The second-to-last mesh (number 7) was selected as a basis to construct the
low-fidelity model. The mesh has 48 points in the y-direction, 115 points on the
airfoil surface, and 20 points in the wake behind the airfoil, with a total of 8295
cells. The reason for selecting this particular mesh is that the difference
in evaluation time is insignificant between the last two meshes (7 and 8), but the
difference in the shock is quite substantial: it is easier to correct the low-fidelity
model if the difference between it and the high-fidelity model is smaller.

For the airfoil considered in this parametric study, the overall evaluation time for 
the low-fidelity model using the above mentioned mesh, and an iteration limit of 
100, is about 35 s, which is approximately 13.5 times faster than the high-fidelity 
model using the fine mesh and traditional convergence criteria. The criteria used in 
this work for the high-fidelity model is a maximum residual of 10'*, or a maximum 
number of iterations of 1000. The overall evaluation time of the high-fidelity model 
in this parametric study is 471 s with a total of 216 iterations. In many cases the 
solver does not fully converge with respect to the residuals and goes on up to 1000 
iterations. Then the overall evaluation time goes up to 2500 s, and the low-fidelity 
model is approximately 73 times faster. Note that the evaluation times reported here
include the time required for connecting twice to the license server, once for the
grid generator, ICEM CFD [19], and once for the flow solver, FLUENT [20].

4.4.3 Results and Discussion 

The optimization method presented here was able to meet the design goals and
yield the optimized design, within the given constraint bands, using 330
low-fidelity model evaluations and 11 high-fidelity model evaluations (Table 4.2). The
equivalent number of high-fidelity model evaluations is less than 18 (taking the
ratio of the high-fidelity model evaluation time to the corrected low-fidelity model
evaluation time as 50). The direct method obtained a similar optimized design, but
required 120 high-fidelity model evaluations.
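
The equivalent cost quoted above follows directly from the evaluation counts and the assumed time ratio of 50. Written out, with $N_{eq}$ used here only as a shorthand that does not appear in the original text:

```latex
N_{eq} = N_f + \frac{N_c}{50} = 11 + \frac{330}{50} = 11 + 6.6 = 17.6 < 18
```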

To meet the design goals, the optimizer makes three fundamental shape changes:
(i) the maximum ordinate of the mean camber line (m) is reduced or kept constant,
(ii) the location of the maximum ordinate of the mean camber line (p) is moved
aft, thus increasing the trailing-edge camber, and (iii) the thickness (t) is reduced.
Shape changes (i) and (iii) reduce the shock strength and, thus, reduce the drag
coefficient. The associated change in the pressure distribution reduces the lift
coefficient. However, shape change (ii) improves (or recovers a part of) the lift by
opening up the pressure distribution behind the shock. These effects are clearly
demonstrated in the pressure distribution plot in Fig. 4.10(a), the airfoil shape
plots in Fig. 4.10(b), and the Mach contour plots in Figs. 4.11(a) and 4.11(b).

Table 4.2 Numerical results for lift maximization while keeping drag below a desired value
at $M_\infty = 0.75$ and $\alpha = 1$ deg. All the numerical values are from the high-fidelity model. $N_c$
and $N_f$ are the numbers of low- and high-fidelity model evaluations, respectively

| Variable    | Initial | Direct¹ | VF-SPRP² |
|-------------|---------|---------|----------|
| m           | 0.0300  | 0.0080  | 0.0090   |
| p           | 0.2000  | 0.6859  | 0.6732   |
| t           | 0.1000  | 0.1044  | 0.1010   |
| $C_l$       | 0.8035  | 0.4641  | 0.4872   |
| $C_d$       | 0.0410  | 0.0041  | 0.0040   |
| A           | 0.0675  | 0.0703  | 0.0680   |
| $N_c$       | N/A     | 0       | 330      |
| $N_f$       | N/A     | 120     | 11       |
| Total cost³ | N/A     | 120     | < 18     |

¹ Direct optimization of the high-fidelity model using the pattern-search algorithm [22].

² Design obtained using the algorithm described in Section 4.3; surrogate model optimization
performed using the pattern-search algorithm [22].

³ The total optimization cost is expressed in terms of the equivalent number of high-fidelity model
evaluations. The ratio of the high-fidelity model evaluation time to the corrected low-fidelity model
evaluation time varies between 13.5 and 73 depending on the design. We use a fixed value of 50 here.

Fig. 4.10 (a) Pressure distribution of initial (solid) and optimized (dashed) airfoils, (b)
initial (solid) and optimized (dashed) airfoil shapes. The Mach number is $M_\infty = 0.75$ and the
angle of attack is $\alpha = 1$ deg.

Fig. 4.11 Mach contour plots of (a) the initial airfoil, (b) the optimized airfoil. The Mach
number is $M_\infty = 0.75$ and the angle of attack is $\alpha = 1$ deg.

The variable-resolution modeling exploited in this study exhibits consistent
behavior, i.e., the changes of the pressure distribution (and, consequently, the figures
of interest such as lift and drag) of the low-fidelity model closely follow those of
the high-fidelity one. This was not the case in the example in Section 4.3.4, where
variable-fidelity physics modeling was exploited with the Euler equations and
TSDE.



4.5 High-Lift Airfoil Design 



In this section, design optimization of a single-element airfoil at steady subsonic
high-lift condition is considered. As before, the surrogate model optimization is
performed using the pattern-search algorithm [22] and the results of the design
methodology are compared to the results obtained through direct optimization of
the high-fidelity model.

4.5.1 Case Setup

The Mach number is set to $M_\infty = 0.2$, the angle of attack is $\alpha = 12$ deg, and the
Reynolds number is Re = 2.3 million. The initial design is set as NACA 0012 and
the objective function is defined by Eq. (4.3), but with the skin friction
distribution $C_f$ included, as viscous effects are important at this condition. The
maximum allowable drag is set to $C_{d.s.max} = 0.0212$. The area constraint is not used
here. The side constraints on the design variables are $0 \le m \le 0.08$, $0.3 \le p \le 0.6$,
and $0.08 \le t \le 0.14$. Constraint tolerance bands are set to 5%.

4.5.2 Model Setup 

The high-fidelity model $f$ solves the RANS equations with the Spalart-Allmaras
one-equation turbulence model [25]. The details of the CFD model are as
described in [26], Chapter 9. The low-fidelity model $c$ is constructed as a low-order
polynomial approximation of the high-fidelity model data, i.e., both the pressure
distribution $C_{p.f}$ and the skin friction distribution $C_{f.f}$. The low-fidelity model is
established in the entire design space using evaluations of $f$ at the following seven designs:
$x^0 = [0.04\ 0.45\ 0.11]^T$ (center point), and $x^1 = [0.0\ 0.45\ 0.11]^T$, $x^2 = [0.08\ 0.45\ 0.11]^T$,
$x^3 = [0.04\ 0.3\ 0.11]^T$, $x^4 = [0.04\ 0.6\ 0.11]^T$, $x^5 = [0.04\ 0.45\ 0.08]^T$,
$x^6 = [0.044\ 0.45\ 0.14]^T$ (single-variable perturbations for all design variables). The
low-fidelity model is defined as a reduced quadratic model (no mixed terms)

$c(x) = \lambda_0 + \lambda_1 m + \lambda_2 p + \lambda_3 t + \lambda_4 m^2 + \lambda_5 p^2 + \lambda_6 t^2$ (4.4)

where the coefficients $\lambda$ are found by solving the linear system $c(x^j) = f(x^j)$, $j = 0,
1, \ldots, 6$.
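
Fitting the reduced quadratic model (4.4) amounts to solving a 7-by-7 linear system, which can be sketched with NumPy as below. The sketch makes several assumptions: the function names are hypothetical, the "high-fidelity" response is a made-up scalar polynomial rather than CFD data, and the seventh design is taken as [0.04 0.45 0.14], i.e., a pure thickness perturbation. For a vector-valued response such as a sampled pressure distribution, the same call works with one column of values per control point.

```python
import numpy as np

def basis(x):
    """Basis of the reduced quadratic model: [1, m, p, t, m^2, p^2, t^2]."""
    m, p, t = x
    return np.array([1.0, m, p, t, m ** 2, p ** 2, t ** 2])

def fit_reduced_quadratic(designs, values):
    """Solve c(x^j) = f(x^j), j = 0..6, for the seven coefficients."""
    A = np.array([basis(x) for x in designs])          # 7 x 7 system matrix
    return np.linalg.solve(A, np.asarray(values, dtype=float))

def evaluate(lam, x):
    """Evaluate the fitted model at design x."""
    return basis(x) @ lam

# Center point plus single-variable perturbations (seventh design is an assumption).
designs = [
    [0.04, 0.45, 0.11],
    [0.00, 0.45, 0.11], [0.08, 0.45, 0.11],
    [0.04, 0.30, 0.11], [0.04, 0.60, 0.11],
    [0.04, 0.45, 0.08], [0.04, 0.45, 0.14],
]

def f(x):
    """Hypothetical stand-in for one scalar high-fidelity response."""
    m, p, t = x
    return 1.2 + 8.0 * m - 0.3 * p + 2.0 * t - 20.0 * m ** 2 + 0.5 * p ** 2 - 5.0 * t ** 2

lam = fit_reduced_quadratic(designs, [f(x) for x in designs])
c_test = evaluate(lam, [0.02, 0.50, 0.12])
```

Because the stand-in response is itself a reduced quadratic, the fit recovers it up to round-off; for real CFD data the model interpolates the seven samples exactly and only approximates the response elsewhere.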

The reason for choosing the approximation-based model c is that the pressure 
distribution does not change significantly for the design space considered in this 
case. The simple model (4.4) is a reasonable compromise between the accuracy 
and the computational cost of creating the response surface. Still, the low-fidelity 
model has to be corrected in order to become a reliable representation of the high- 
fidelity one in the optimization process. 

Figure 4.12 shows the construction of the SPRP model for the high-lift airfoil
design. When compared to the transonic case, the pressure distribution is simpler
(no pressure shock). However, the figures of interest (particularly drag) are very
sensitive to changes of the distribution, so that much attention has to be paid to a
detailed "description" of the distribution through SPRP characteristic points,
particularly for $x/c$ close to zero, where the pressure gradients with respect to $x/c$ are
large. The pressure distributions of the low-fidelity model are illustrated in
Fig. 4.12, at $x^{(i)} = [0.01\ 0.40\ 0.09]^T$ for $M_\infty = 0.2$ and $\alpha = 10^\circ$, as well as $C_{p.c}$ at
$x = [0.02\ 0.35\ 0.10]^T$. The pressure distribution of the high-fidelity model at the given
design, here $x$, is predicted using the translation vectors applied to the
corresponding characteristic points of the pressure distribution of the high-fidelity
model at $x^{(i)}$, $C_{p.f}(x^{(i)})$. This is illustrated in Fig. 4.13. The predicted pressure
distribution (magnified parts only) of the high-fidelity model at $x$ as well as the actual
$C_{p.f}(x)$ are shown in Fig. 4.14.

4.5.3 Results and Discussion

The optimization method of Section 4.3 with the low-fidelity model (4.4)
improves the lift coefficient from 1.235 to 1.491 (+25.6 lift counts) by increasing
camber by 2.34% and moving the location of maximum camber further aft, from
0.45 to 0.60, which is the upper bound (Table 4.3). The thickness is increased
from 12% to 14% (which is the upper bound). A comparison of the initial and
optimized airfoil shapes is given in Fig. 4.15(c). The optimized design is achieved
using only 11 high-fidelity CFD evaluations. The direct optimization method
required 65 high-fidelity CFD evaluations and improves the lift only by 11.3%.




Fig. 4.12 Example low-fidelity model pressure distribution at the design $x^{(i)}$, $C_{p.c}(x^{(i)})$ (solid
line), the low-fidelity model pressure distribution at another design $x$, $C_{p.c}(x)$ (dotted line),
characteristic points of $C_{p.c}(x^{(i)})$ (circles) and $C_{p.c}(x)$ (squares), and the translation vectors
(short lines). Only selected points and vectors are shown for the sake of clarity.
Selected parts of the distributions are magnified.

Fig. 4.13 High-fidelity model pressure distribution at $x^{(i)}$, $C_{p.f}(x^{(i)})$ (solid line) and the
predicted high-fidelity model $C_p$ at $x$ (dotted line) obtained using SPRP based on characteristic
points of Fig. 4.12; characteristic points of $C_{p.f}(x^{(i)})$ (circles) and the translation vectors
(short lines) were used to find the characteristic points (squares) of the predicted
high-fidelity model pressure distribution

Fig. 4.14 High-fidelity model pressure distribution at $x$, $C_{p.f}(x)$ (solid line), and the
predicted high-fidelity model pressure distribution at $x$ obtained using SPRP (thick dotted
line). Low-fidelity model pressure distribution at $x$ is shown using a thin dashed line. The SPRP
model ensures better accuracy than the low-fidelity model.

Table 4.3 Numerical results for lift maximization while keeping drag below a desired value
at $M_\infty = 0.2$, $\alpha = 12$ deg, and Re = 2.3 million

| Variable    | Initial | Direct¹ | VF-SPRP² |
|-------------|---------|---------|----------|
| m           | 0.0000  | 0.0150  | 0.0234   |
| p           | 0.4500  | 0.4840  | 0.6000   |
| t           | 0.1200  | 0.1247  | 0.1400   |
| $C_l$       | 1.235   | 1.392   | 1.491    |
| $C_d$       | 0.0212  | 0.0212  | 0.0210   |
| Design cost | N/A     | 65      | 11       |

¹ Direct optimization of the high-fidelity CFD model using the pattern-search algorithm [22].

² Design obtained using the algorithm described in Section 4.3; surrogate model optimization
performed using the pattern-search algorithm [22].



There are three major changes in the optimized design when compared to the 
initial one. First of all, the increased camber opens up the pressure distribution 
over the whole airfoil, as can be seen in Fig. 4.15(a), and thus the lift increases. 

Also, the aft camber opens the pressure distribution up near the trailing-edge, 
also increasing lift. Finally, the increased thickness reduces the pressure peak near 
the leading-edge, thus creating a milder expansion around the leading-edge, and 
thereby reducing pressure drag. The result is an optimized airfoil with improved lift 
coefficient at the same drag coefficient. A comparison of the lift and drag curves is 
given in Fig. 4.16. Although the airfoil was optimized at a = 12°, the entire lift 
curve is shifted upwards. In this case, the angle of attack at maximum lift increases 
slightly (approximately by 1°). 

Figure 4.17 shows the optimization history. In particular, one can observe a
convergence plot (Fig. 4.17(a)), as well as the evolution of the objective function
(Fig. 4.17(b)), the lift coefficient (Fig. 4.17(c)) and the drag coefficient
(Fig. 4.17(d)). It follows that the algorithm exhibits a good convergence pattern and
that the mechanisms introduced in the algorithm (in particular the trust region
approach and the penalty function) enforce the drag limitation to be satisfied while
increasing the lift coefficient as much as possible.

Fig. 4.15 Comparison of initial and optimized designs at $M_\infty = 0.2$, $\alpha = 12$ deg, and Re = 2.3
million; (a) pressure distributions of the initial and optimized designs, (b) skin friction
distributions of the initial and optimized designs, (c) initial and optimized airfoil shapes



[Figure 4.16: (a) lift coefficient and (b) drag coefficient plotted against the angle of attack $\alpha$ [°] for the initial and optimized designs]

Fig. 4.16 A comparison of the lift and drag curves of the initial and optimized designs at
$M_\infty = 0.2$ and Re = 2.3 million

Fig. 4.17 Optimization history; (a) convergence plot; (b) evolution of the objective
function; (c) evolution of the lift coefficient; and (d) evolution of the drag coefficient (drag
constraint marked using a solid horizontal line). The graphs show all high-fidelity function
evaluations performed in the optimization.



4.6 Summary 



A variable-fidelity airfoil design optimization algorithm has been presented. The
algorithm uses a computationally cheap low-fidelity model to construct a
surrogate of an accurate but CPU-intensive high-fidelity model. The low-fidelity model
is corrected by aligning the airfoil surface pressure distribution with the
corresponding distribution of the high-fidelity model by means of the shape-preserving
response prediction technique. This ensures a good generalization
capability of the surrogate model with respect to both objectives and constraints. The
robustness of the algorithm is enhanced by embedding it in the trust region
framework. Applications for transonic and high-lift airfoil design are
demonstrated, with the optimized designs obtained at a computational cost
corresponding to a few high-fidelity model evaluations.

Chapter 5

Evolutionary Optimisation Techniques to 
Estimate Input Parameters in Environmental 
Emergency Modelling 

Kerstin Wendt, Monica Denham, Ana Cortes, and Tomas Margalef 



Abstract. Parameter estimation in environmental modelling is essential for input pa- 
rameters, which are difficult or impossible to measure. Especially in simulations for 
disaster propagation prediction, where hard real-time constraints have to be met to 
avoid tragedy, the additionally introduced computational burden of advanced global 
optimisation algorithms still hampers their use in many cases and poses an ongoing 
challenge. In this chapter we demonstrate how modifications of a Genetic Algorithm 
(GA) are able to decrease time-consuming fitness evaluations and hence to speed up 
parameter calibration. Knowledge from past observed catastrophe behaviour is used 
to guide the GA during various phases towards promising solution areas resulting 
in a fast convergence. Together with parallel computing techniques it becomes a 
viable estimation approach in environmental emergency modelling. Encouraging 
results were obtained in predicting forest fire spread. 

Keywords: input parameter estimation, environmental modelling, knowledge- 
guided Genetic Algorithm, forest fire spread prediction. 

Kerstin Wendt • Ana Cortes • Tomas Margalef
Departament d'Arquitectura de Computadors i Sistemes Operatius,
Escola d'Enginyeria, Universitat Autonoma de Barcelona,
08193 Bellaterra (Barcelona), Spain
email: kerstin.wendt@caos.uab.es,
email: {ana.cortes, tomas.margalef}@uab.es

Monica Denham
Facultad de Ingenieria, Universidad Nacional de Rio Negro,
Sede Andina, San Carlos de Bariloche, Argentina

X.-S. Yang & S. Koziel (Eds.): Computational Optimization & Applications, SCI 359, pp. 125-143.
springerlink.com © Springer-Verlag Berlin Heidelberg 2011

5.1 Introduction

There exist many environmental models for simulating, explaining, and predicting
the behaviour of complex natural phenomena. These comprise models which
simulate standard events and processes in meteorology, oceanography, groundwater
hydrology, and petroleum reservoirs, as well as more specific models which attempt
to simulate environmental emergencies such as floods, hurricanes, oil spills, forest
fires, or volcanic eruptions. In either case, the precision of simulation output heavily
depends on the quality of entered input parameter values. One cannot expect
correct results if the entries fed into the simulator are erroneous. Special applications
such as propagation predictions of natural catastrophes require the most reliable
simulation outcomes to prevent tragedy. Furthermore, these computationally
intensive applications have to fulfil stringent real-time constraints to be of use during an
ongoing disaster.

The need for input parameter estimation and calibration to improve model output 
is a long-known and often-tackled problem, particularly in environments where correct
and timely input parameters cannot be provided [12], [24], [40]. Therefore, computational
parameter estimation and optimisation strategies are required to minimise
the deviation between the predicted scenario and the real phenomenon behaviour.
Since input parameter calibration adds a significant computational effort to the simulation
process, a fast and efficient approach is all the more important for time-critical
applications.

Many approaches for parameter calibration mainly use standard numerical optimisation
techniques, e.g. Kalman filter [22] or principal differential analysis [34],
which are not fully capable of handling the high dimensionality, nonlinearity, and
irregularities contained in environmental models [41]. Bayesian calibration methods,
including Monte Carlo sampling, as well as stochastic practices, e.g. Simulated An- 
nealing and Genetic Algorithms, are numerically very intensive, but generally de- 
liver good results and tend to find global optima. With the continuous increases in 
computing power, these calibration methods, especially Genetic Algorithms (GA), 
have become practicable to solve the parameter problem of environmental mod- 
els. High performance, parallel, and distributed computing now enable the gener- 
ation of tractable solutions JT] [iTl to expensive and time-consuming optimisation 
problems. 

In this chapter, we summarise how a hybrid GA approach introduces problem- 
specific knowledge into different phases of the GA and is able to boost its perfor- 
mance. In doing so, online parameter estimation in time-critical applications can be 
provided. The real case of forest fire spread prediction is chosen to demonstrate how 
the employment of past observed or simulated disaster behaviour stored in a knowl- 
edge base speeds up parameter calibration for environmental emergency models. 

The remainder of this work is organised as follows. The next section gives an
overview of modelling environmental emergencies and explains details about the
prediction of forest fire spread. Afterwards, the input parameter estimation problem
is characterised. In section 5.3 we describe the implementation of a parallelised
GA for parameter estimation. The benefits of applying domain-specific knowledge,
including knowledge representation, retrieval and insertion, are outlined. Experimental
results are shown in section 5.4, and section 5.5 comprises the main conclusions
and briefly discusses future work.

5.2 The Input Parameter Problem in Environmental Emergency Modelling

Environmental emergencies include natural events as well as human-induced accidents
that can cause severe environmental damage, e.g. loss of ecological
resources, air pollution, erosion, water contamination, climate changes, extinction 
of species, as well as destruction of buildings and infrastructure and, maybe most 
importantly, loss of human lives. Where environmental emergencies cannot be pre- 
vented, their prediction is crucially important. To avoid tragedy, most environmental 
emergency management systems include effective tools to forecast the propagation 
of an ongoing event. Based on these forecasts combined with expert user experience, 
disaster warnings are issued and catastrophe fighting actions are decided. There is 
no doubt about the importance of quick and most reliable forecasts to minimise the 
number of victims, the amount of damage caused, and the employed resources for 
disaster combatting. For prediction purposes mainly computer modelling and simu- 
lation applications are used to forecast the state of the event for a given time in the 
near future. 

Basic disaster propagation models for the different emergencies have been available
since the early seventies, e.g. CLIPER for hurricane track prediction [31] and
Rothermel's model for forest fire spread behaviour [36], and are subject to continuous
enhancements, e.g. NAME III for volcanic ash dispersion [21]. Many environmental
models and simulators originate from research activities and are often
bound to a specific geographic region or particular vegetational characteristics, e.g.
PROMETHEUS, a wildfire growth simulator designed to work in Canadian boreal
forest [38], or HFire, a raster-based model for fire behaviour in Southern California
chaparral [30]. To become established in the scientific community, a model
should preferably be generally applicable. If the model itself and, moreover, the
simulator which implements the model also fulfil certain end user requirements,
e.g. graphical user interface, GIS integration, user support and training, modelling
of special disaster occurrences such as crown fires or underground peat fires, it is
most likely to be applied in daily use in disaster management centres, e.g. FARSITE.

5.2.1 Forest Fire Spread Prediction 

There exists a large variety of wildfire behaviour and spread simulators (e.g. FARSITE
ini, BehavePlus, fireLib 0, NEXUS) and most of them are based on the
Rothermel equation model [36]. The two main approaches to propagate fire spread
are based on a regular grid system (cellular automata) and on continuous planes
(elliptical wave propagation). Fire simulators can be used as stand-alone applications
for risk analysis, disaster evolution prediction and fire fighter training. Furthermore,
they form a fundamental part of complex decision support systems (DSS), which
are typically applied to monitor environmental emergencies such as forest fires.
The simulators traditionally work with a set of input parameters describing the
environmental conditions of the region where the fire takes place, including vegetational,
climatological and topographical characteristics. Simulators differ in their
required model parameters and in their input and output data formats. The classic way of
predicting forest fire behaviour takes the initial state of the fire front (RF = real fire)
as input as well as the input parameters given for time $t_x$. The simulator then returns
the prediction (SF = simulated fire) for the state of the fire front at a later time $t_{x+1}$, as
shown in figure 5.1.

Comparing the simulation result SF for time $t_{x+1}$ with the advanced real fire
RF at the same instant, the forecasted fire front tends to differ to a greater or lesser
extent from the real fire line because the calculation of the simulated fire is based
upon a single set of input parameters afflicted with certain insufficiencies. These are
explained in section 5.2.2. To enable real-time calibration of model input parameters
in each time step during an ongoing prediction, a simulator-independent two-stage
data-driven prediction scheme was proposed by Abdalhaq et al. in [1]. By introducing
a preceding calibration step, as shown in figure 5.2, the set of input parameters is
refined before every prediction step. A similar two-stage data assimilation scheme
for ecological modelling has been proposed by Zhu et al. in [44] and delivered
promising results.

The objective is to solve an inverse problem: Find a parameter configuration
such that, given this configuration as input, the model output matches real disaster
behaviour. Having detected the simulator input that best describes current environmental
conditions, the same values, it is argued, could also be used to best describe
the immediate future, assuming stable meteorological conditions during the following
prediction interval. Thus, the prediction becomes the result of a series of
automatically adjusted input configurations.

Fig. 5.2 Two-stage data-driven forest fire prediction scheme

As an optimisation technique for the calibration step, Abdalhaq et
al. initially proposed to employ a Genetic Algorithm (GA) in [1]. Different combinations of
input parameter values (scenarios) are generated, evolved, and evaluated. By comparing
the simulated fire front at time $t_{x+1}$ to the real fire front at the same time, the
quality of the corresponding parameter set can be obtained. Lastly, the best fitting
scenario is selected to serve as input for the following prediction phase. On the one
hand, the data-driven prediction scheme significantly enhances the quality of input
parameters and overall prediction results as proven by Denham et al. in [9]. On
the other hand, it introduces a significant additional amount of computational effort
and consequently increases prediction runtime in a non-negligible way. In real-time
disaster modelling applications, parameter estimation time needs to be reduced to
a practicable minimum. In section 5.3 we explain why a modified GA is
suited as a parameter estimation technique in environmental modelling and give
implementation details.
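
The two-stage scheme can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the one-dimensional `simulate` function is a toy stand-in for a real fire simulator, the fire front is reduced to a single position, and an exhaustive search over a few hypothetical candidate scenarios takes the place of the GA.

```python
def simulate(front, params, dt=1.0):
    """Toy stand-in for a fire simulator: the front position advances by wind * dt."""
    return front + params["wind"] * dt


def calibrate(front_prev, front_now, candidates):
    """Calibration stage: keep the scenario whose simulated front best matches the real one."""
    return min(candidates, key=lambda p: abs(simulate(front_prev, p) - front_now))


def two_stage_predict(observed_fronts, candidates):
    """For each observed interval, calibrate on it and then predict the next time step."""
    predictions = []
    for prev, now in zip(observed_fronts, observed_fronts[1:]):
        best = calibrate(prev, now, candidates)   # stage 1: calibration
        predictions.append(simulate(now, best))   # stage 2: prediction
    return predictions


# Hypothetical candidate scenarios and real front positions at three instants.
candidates = [{"wind": w} for w in (0.5, 1.0, 1.5, 2.0)]
observed = [0.0, 1.5, 3.0]
print(two_stage_predict(observed, candidates))
```

The point of the sketch is only the control flow: each prediction uses the scenario that best explained the most recently observed interval, rather than a fixed initial parameter set.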

5.2.2 The Input Parameter Problem 

Inaccuracy and uncertainty in the normally large number of input parameters are 
known and serious problems in environmental modelling leading to unreliable prop- 
agation predictions in disaster modelling. Model output is particularly sensitive to 
those parameters which have a direct impact on model simulation but cannot be
well determined by direct observations. In disaster propagation models input pa- 
rameters are often difficult or even impossible to measure in practice and hence are 
always incomplete and uncertain. Many parameters are highly dynamic and subject 
to frequent spatiotemporal changes in the microclimate generated by a disaster (e.g. 
strong wind gusts in forest fires). Recent advances in measurement technologies, 
remote sensing and power supply for sensors f3T| help to remove part of this un- 
certainty, but installation and maintenance of sufficient sensors remain expensive 
and therefore a fundamental hindrance in large areas, which are sparsely populated 
and difficult to access. Instead, input parameter values for an upcoming real-time 
prediction are initialised with current weather and area forecast data provided by 
discrete meteorological stations. These might include measurements for uncommon 
parameters (e.g. fuel moisture) with temporal resolutions too low to be of use during 
an ongoing hazard prediction. 

Even assuming that observed parameter values with sufficient precision become available
as the prediction advances, simulators often work with a static set of input parameters,
not considering changes in parameter values over time. While some of the
available simulators offer integration and support for GIS data, very few tools provide
capabilities for real-time data assimilation of altered meteorological data during a running
simulation. In consequence, the prediction error accumulates gradually and
simulation outcomes will deviate from the true state when the model is run for an
increased prediction period [44]. A workaround to this problem might be the inclusion
of the simulator into external frameworks as proposed by Rodriguez in [35].
Nevertheless, the correct or real parameter value set might not result in the best
overall simulation output, as explained in [2]. Simulation errors are not only due to
uncertainties in input, but are also the product of model errors (overly simplified
description of the natural system) and computation errors (truncation and rounding
problems). It is therefore common practice to apply the concept of best fitting input
and search for input parameters in such a way that they produce the best overall
simulation result.

In addition to imprecision in input parameters, grid-based propagation models
have to deal with spatial uncertainty [18]. When real-time constraints have to be
met, low resolution representations of the region under consideration are preferable 
to reduce simulation runtime. Consequently, the contained heterogeneous environ- 
ments (e.g. different vegetation models) might not be mapped with the necessary 
degree of exactness. Furthermore, some theoretical model parameters, e.g. arbitrary 
empirical values, might miss a readily-identifiable counterpart in reality according 
toGl. 

These observations strongly recommend applying parameter estimation in environmental
emergency modelling, but to date only a minority of models and
simulators include estimation techniques by default. Hence, working with outdated
field measures and with estimated, extrapolated and missing values keeps generating
unsatisfactory results and poses an ongoing optimisation challenge. This is why we
propose a general and simulator-independent approach to enhance the prediction of
disaster propagation by using a knowledge-guided GA for input parameter calibration,
provided that the following requirements are met: (1) the range of each input parameter
is known and defined, (2) the sensitivity index of each model parameter is known
or at least the most sensitive input parameters are identified, (3) information about
the true state of the disaster is available in reasonable time intervals, and (4) there
exists an initial knowledge base (KB) containing information about past disasters.

5.3 Parameter Estimation with Knowledge-Guided Genetic 
Algorithm 

Annan and Hargreaves state in [2] that the estimation of input parameters in
high-dimensional models is an inherently intractable problem. Moreover, there is probably
no general solution to the problem that will work efficiently in all applications,
though there exist efforts in developing universal and model-independent parameter
estimation frameworks [12], [33]. Building on that, our focus is on developing
methods which aim to achieve acceptable levels of precision by taking advantage
of characteristics that exist in disaster propagation prediction.

Various approximations and heuristic methods have become popular over the
years and are now established standards in the scientific community. Genetic Algorithms
in particular, together with Monte Carlo sampling and the ensemble Kalman filter,
are widely applied as an approach to solve global optimisation problems in all areas.
GA are a search heuristic and imitate the process of natural evolution [15]. GA are
part of the larger class of evolutionary algorithms (EA) and they repetitively apply
the methods elitism, selection, crossover, mutation, fitness calculation, and reinsertion.
In accordance with [41], GA can be an effective tool for parameter optimisation
in environmental modelling and have, like other population-based global-search
approaches, desirable properties: They are able to rapidly locate good solutions, even
for large search spaces, and are especially useful in problem domains that have a complex
fitness landscape. GA are therefore applicable and widely used to solve the
parameter problem of environmental models [8], [14], [25], [29], [32].

5.3.1 Parallel Implementation of Hybrid Genetic Algorithm 

The major obstacle to utilise parameter estimation practices in time-critical simu- 
lations is, first of all, due to the enormously increased computational burden. Most 
computational time required for calibrating parameters in complex environmental 
models is spent running the model code and generating the desired output. GA ex- 
ecution requires repeated fitness function evaluations with often very expensive ob- 
jective functions. In addition, most environmental modelling problems possess a 
large quantity of input parameters creating a vast search space to be explored by the 
GA. Relying on fitness approximation methods, summarised in [20, 28], can be one
possibility to tackle this problem and remains an active research field. Further
criticisms include the complex parameter tuning of GA and their tendency to con- 
verge towards local optima or even arbitrary points rather than the global optimum 
in many problems. 

During an ongoing disaster, a short simulator response time is essential to meet
real-time requirements and to ensure that simulation outcomes can be of use. To
make evolutionary parameter estimation methods generally applicable to real-time 
environmental emergency modelling, possibly even granting feasible solutions in 
case of limited computational resources or high-resolution prediction maps, some 
modifications and enhancements are needed to speed up GA runtime. 

Firstly, instead of using a sequential GA implementation, it is indispensable to
make use of high performance computing and apply a parallel GA to substantially
reduce global computing time. GA are extremely parallelisable, and Denham implemented
a parallel GA version based on the master/worker paradigm in [10]. MPI
[16] libraries manage communication between the configurable number of worker
nodes. Data parallelism was chosen: the individuals of the GA population
are divided into chunks and distributed to the available workers, which carry out the
corresponding fitness evaluations. The pseudo code of the parallelised implementation,
split up into master and worker node operations, can be found in [10].
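
The chunk-wise distribution of fitness evaluations can be sketched as follows. This is not the MPI implementation referred to above: as a simplifying assumption, a thread pool from the Python standard library stands in for the worker nodes, and `fitness` is a hypothetical placeholder for one expensive simulator run.

```python
from concurrent.futures import ThreadPoolExecutor


def fitness(individual):
    """Hypothetical placeholder for one expensive simulator run; lower is better."""
    return sum((gene - 0.5) ** 2 for gene in individual)


def split_into_chunks(population, n_workers):
    """Divide the population into at most n_workers contiguous chunks."""
    size = -(-len(population) // n_workers)  # ceiling division
    return [population[i:i + size] for i in range(0, len(population), size)]


def worker(chunk):
    """Worker side: evaluate every individual of the received chunk."""
    return [fitness(individual) for individual in chunk]


def evaluate_population(population, n_workers=4):
    """Master side: distribute chunks and gather the results in population order."""
    if not population:
        return []
    chunks = split_into_chunks(population, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(worker, chunks)  # map() preserves the chunk order
    return [value for chunk_result in results for value in chunk_result]
```

Because the individuals are evaluated independently of each other, the master only needs to scatter the chunks and concatenate the returned fitness lists; the evolutionary operators themselves stay sequential.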

The GA used for parameter calibration in the data-driven forest fire prediction
works on a population $x = [x_1, x_2, \ldots, x_p]$ consisting of $p$ individuals (chromosomes)
$x_k = [x_{k1}, x_{k2}, \ldots, x_{kd}]$, $\forall k: 1 \le k \le p$. Each individual of the population represents
a simulator input parameter set (scenario) made up of $d$ parameter values, e.g. fuel
model, slope, fuel moistures and wind characteristics. The genes $x_{kj} \in [lb_j, ub_j]$
of each individual are encoded as real values within a previously defined range,
where $lb_j, ub_j \in \mathbb{R}$, $\forall k: 1 \le k \le p$, $\forall j: 1 \le j \le d$, with $lb_j$ being the lower bound
of parameter values for gene $j$ and $ub_j$ the upper bound for gene $j$. Individuals of
the initial population are randomly generated within given lower and upper bounds.
We apply elitism to keep the most promising individuals throughout the
generations and guarantee fast and smooth convergence of the GA. Roulette wheel
selection and one-point crossover are employed. Details on the mutation operator
are given in section 5.3.4.
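
The operators named above can be sketched in a few lines of Python. The sketch makes several assumptions that are not taken from the text: errors are turned into roulette wheel weights by simple inversion, two elite individuals are kept, and mutation is left out because it is treated separately.

```python
import random


def init_population(p, lb, ub, rng):
    """Create p individuals with every gene drawn uniformly from [lb_j, ub_j]."""
    return [[rng.uniform(lb[j], ub[j]) for j in range(len(lb))] for _ in range(p)]


def roulette_select(population, errors, rng):
    """Roulette wheel on inverted errors, so that small errors get large slices (assumption)."""
    weights = [1.0 / (1e-9 + e) for e in errors]
    return rng.choices(population, weights=weights, k=1)[0]


def one_point_crossover(a, b, rng):
    """Swap the tails of two parents after a random cut point (needs at least two genes)."""
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def next_generation(population, errors, rng, n_elite=2):
    """Elitism first, then fill the rest with offspring of roulette-selected parents."""
    ranked = [ind for _, ind in sorted(zip(errors, population), key=lambda pair: pair[0])]
    new_population = [list(ind) for ind in ranked[:n_elite]]
    while len(new_population) < len(population):
        parent_a = roulette_select(population, errors, rng)
        parent_b = roulette_select(population, errors, rng)
        new_population.extend(one_point_crossover(parent_a, parent_b, rng))
    return new_population[:len(population)]
```

Since one-point crossover only exchanges genes at the same position, every offspring gene stays inside the bounds of its own parameter without any repair step.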

The goodness of the generated scenarios is evaluated by a problem-dependent
fitness function. In the fire prediction context, the fittest scenario is the one that
generates the simulated fire map most similar to the real map of fire propagation.
To determine the fitness of each scenario, an error function (5.1) based on a cell-by-cell
comparison of the affected terrain is applied, relating the erroneously
burnt cells of the simulated fire to all really burnt cells:

$$\mathit{error} = \frac{(\cup - \mathit{initial\_fire}) - (\cap - \mathit{initial\_fire})}{\mathit{real\_fire} - \mathit{initial\_fire}} \qquad (5.1)$$

Here $\cup$ is the number of cells burnt in one or both of the real and simulated fire
maps and $\cap$ is the number of cells burnt in both the real and simulated fire
maps. The error is to be minimised during the optimisation process. For every individual $x_k$
in the population of the GA a simulation has to be run to be able to compute the
goodness of this individual, which makes the fitness evaluation process particularly
time-consuming and expensive, as explained at the beginning of this section.
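
As an illustration of equation (5.1), the following sketch computes the error from three equally sized grids. The grid representation (nested lists with truthy values for burnt cells) and the function name are assumptions made for this example.

```python
def prediction_error(real, simulated, initial):
    """Equation (5.1) on three equally sized grids (truthy value = burnt cell)."""
    cells = [
        (r, s, i)
        for real_row, sim_row, init_row in zip(real, simulated, initial)
        for r, s, i in zip(real_row, sim_row, init_row)
    ]
    union = sum(1 for r, s, _ in cells if r or s)          # burnt in one or both maps
    intersection = sum(1 for r, s, _ in cells if r and s)  # burnt in both maps
    initial_fire = sum(1 for _, _, i in cells if i)
    real_fire = sum(1 for r, _, _ in cells if r)
    # Undefined (division by zero) if the real fire did not grow beyond the initial fire.
    return ((union - initial_fire) - (intersection - initial_fire)) / (real_fire - initial_fire)


initial = [[1, 0, 0],
           [0, 0, 0]]
real = [[1, 1, 1],
        [0, 0, 0]]
simulated = [[1, 1, 0],
             [1, 0, 0]]
print(prediction_error(real, simulated, initial))
```

In this small example the simulated fire misses one really burnt cell and burns one cell that stayed unburnt, so two erroneous cells are related to the two cells by which the real fire grew. A perfect prediction yields an error of 0.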

Secondly, to further speed up GA execution and reduce the number of fitness
evaluations to a practicable minimum, decreasing the number of individuals in the population
could be considered. This approach normally leads to an unwanted loss of
population diversity and is therefore not recommended. On the contrary, population 
size should be proportional to problem size and therefore increase in high dimen- 
sional optimisation problems to cover most parts of the search space. 

Thirdly, to provide near-optimal solutions with sufficient precision in early gen- 
erations, i.e. yield short convergence times of the GA without compromising the 
good search quality, we embed domain knowledge into various phases of the GA to 
quickly guide it towards promising solution areas. This technique, together with an 
adapted mutation probability, not only manages to accelerate GA execution, but also 
prevents the algorithm from converging towards arbitrary points. The architecture of 
the resulting hybrid GA is shown in figure 5.3, and our proposals for representation,
retrieval, and insertion of problem-specific knowledge are outlined in the following 
subsections. 

Fig. 5.3 Architecture of the hybridised GA guided with problem-specific knowledge



5.3.2 Knowledge Representation 

To make proper use of expert knowledge during GA execution, we maintain a 
knowledge base (KB), which contains information on the behaviour of past for- 
est fires. It associates model input parameter sets with their model outcome. More 
precisely, data is stored in form of simulator output parameters determining fire 
propagation, e.g. spread direction and speed, together with the causing environmen- 
tal conditions, e.g. slope, fuel model, wind characteristics, as inputs to the simulator. 
The majority of these parameters are continuous, few are nominal. This approach 
does not require the complete set of input parameters to be stored, but best results 
are obtained if the most sensitive and dynamic parameters are available. 

Where data of real emergencies was not available, synthetic fires were simulated
to ensure that the knowledge base contains a considerable number of input
configurations and covers the input parameter space to the highest degree possible.
At the moment, there are more synthetic fires present in the KB than information
on real catastrophes, as correct and reliable data is difficult to obtain. But due
to improved observation technologies and documentation possibilities via satellites 
and remote sensing, reasonable growth and refinement of knowledge is expected in 
the near future. The KB was implemented as a standard relational database, which 
is essential if it reaches a reasonable size. We can thus take advantage of database 
management techniques enhancing flexibility in general knowledge management 
and retrieval. 
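
To make the idea of a relational knowledge base more concrete, the sketch below stores a few observations in an in-memory SQLite database and runs a simple range query on the static terrain characteristics. The table layout, all column names, the slope margin and the three example rows are invented for illustration and do not describe the actual KB schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE fire_observation (
           id INTEGER PRIMARY KEY,
           slope_deg REAL, fuel_model INTEGER,       -- static terrain inputs
           wind_speed_mph REAL, wind_dir_deg REAL,   -- dynamic inputs
           spread_dir_deg REAL, spread_speed REAL,   -- simulator outputs
           synthetic INTEGER                         -- 1 = simulated fire, 0 = real fire
       )"""
)
rows = [  # made-up example records
    (18.0, 7, 6.0, 90.0, 95.0, 1.2, 1),
    (20.0, 7, 10.0, 180.0, 175.0, 2.1, 1),
    (6.0, 5, 4.0, 270.0, 260.0, 0.6, 0),
]
conn.executemany(
    "INSERT INTO fire_observation (slope_deg, fuel_model, wind_speed_mph, "
    "wind_dir_deg, spread_dir_deg, spread_speed, synthetic) VALUES (?, ?, ?, ?, ?, ?, ?)",
    rows,
)


def similar_terrain(conn, slope_deg, fuel_model, slope_margin=5.0):
    """Range query: observations with the same fuel model and a comparable slope."""
    cursor = conn.execute(
        "SELECT wind_speed_mph, wind_dir_deg, spread_dir_deg, spread_speed "
        "FROM fire_observation "
        "WHERE fuel_model = ? AND slope_deg BETWEEN ? AND ? ORDER BY id",
        (fuel_model, slope_deg - slope_margin, slope_deg + slope_margin),
    )
    return cursor.fetchall()
```

A query of this kind keeps the candidate set small, so that any more expensive similarity computation only has to look at observations from comparable terrain.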



5.3.3 Knowledge Retrieval 

It is necessary to find the corresponding knowledge in an efficient manner to
avoid a new increase in parameter estimation runtime. For this purpose, model output
parameters which describe the current fire's behaviour (maximum fire spread
direction and speed) are obtained from the real fire map RF at time $t_{x+1}$ once in
every calibration step before starting the GA. A clustering algorithm takes these
values together with available static terrain characteristics as arguments to query the
KB. It aims to find the most similar past fire behaviours and returns their causing
meteorological conditions, i.e. the query result delivers the input parameter values
from forest fires that behaved comparably to the ongoing event. The retrieved
values are then gathered in a knowledge chromosome $kc = [kc_1, kc_2, \ldots, kc_d]$, which
is stored temporarily and thus available in every phase during each iteration of the
GA.

The clustering algorithm is based on a k-Nearest Neighbour search and delivers
the k input parameter scenarios that are most likely to cause current wildfire propagation.
Presently, k is set to 1. We apply a distance function derived from the Heterogeneous
Euclidean-Overlap Metric (HEOM) [43] to measure similarity between
individual events. This distance measure takes into account differences between
nominal and linear parameters: For nominal data a value-matching-based metric
is defined, and linear parameters are compared with the normal Euclidean metric including
range normalisation to prevent parameters with large ranges from overpowering
those with smaller ranges.

The sensitivity analysis conducted by Abdalhaq et al. in [1] shows that fire spread
is influenced to varying degrees by the existing input parameters. The parameters
which mainly affect fire propagation are wind speed, wind direction and slope characteristics.
In order to correctly reflect this sensitivity of inputs and to obtain meaningful
results, in [42] Wendt et al. added to HEOM a parameter importance ranking by
means of a weight factor $w_k$. The distance measure finally results in

$$\mathit{WHEOM}(x, y) = \sqrt{\sum_{k=1} w_k \, h_k(x_k, y_k)^2} \qquad (5.2)$$

where $h_k$ stands for the two mentioned metrics.

Knowledge retrieval is performed in two steps. First, the SQL mechanism identi- 
fies the stored forest fire observations related to the current fire, i.e. fires that present 
the most similar slope and fuel model characteristics compared to the ongoing disas- 
ter. The second step evaluates the similarity of the retrieved observations to the fire 
under prediction by applying WHEOM. Thus, the distance calculation
is not executed on the complete KB dataset, but only on the result set of an
intermediary range query.
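
A weighted distance in the spirit of equation (5.2) and the subsequent nearest-neighbour step might look as follows. The sketch assumes the standard HEOM component metrics (overlap for nominal attributes, range-normalised absolute difference for linear ones); the attribute order, weights and ranges in the usage lines are hypothetical, and the circular nature of wind direction is ignored for brevity.

```python
import math


def wheom(x, y, weights, ranges, nominal):
    """Weighted HEOM-style distance between two records of equal length."""
    total = 0.0
    for k, (xk, yk) in enumerate(zip(x, y)):
        if k in nominal:
            h = 0.0 if xk == yk else 1.0      # overlap metric for nominal values
        else:
            h = abs(xk - yk) / ranges[k]      # range-normalised difference
        total += weights[k] * h ** 2
    return math.sqrt(total)


def nearest(query, records, weights, ranges, nominal, k=1):
    """Return the k records closest to the query (k = 1 mirrors the current setting)."""
    return sorted(records, key=lambda rec: wheom(query, rec, weights, ranges, nominal))[:k]


# Hypothetical attribute order: (fuel model, spread speed, spread direction in degrees).
weights = (1.0, 4.0, 1.0)
ranges = (None, 40.0, 360.0)   # no range is needed for the nominal attribute
nominal = {0}
query = (7, 10.0, 90.0)
records = [(7, 20.0, 180.0), (7, 11.0, 95.0), (5, 10.0, 90.0)]
best = nearest(query, records, weights, ranges, nominal)
```

In combination with a preceding range query, `records` would only contain the pre-filtered observations, which keeps the number of distance evaluations low.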

5.3.4 Knowledge Insertion 

In recent years, it proved of value for certain problems in different areas to hybridise 
GA by introducing problem-specific knowledge lfm[T9ll23ll26ll27l . A special in- 
terest in the use of non-random mutation operators could be observed. In general, 
the quality of the optimisation is remarkably better when domain knowledge is in- 
corporated in the problem solving process as mentioned by Q. Up-to-date, few 
general principles, guidelines and best practices on how to incorporate which type
of knowledge into GA exist and the efficiency of this method is mainly proven by
experiments. This is why current research tries to generalise the introduction of
domain knowledge in evolutionary algorithms [3], [4], [6].

Using historical incidents can be a good choice to treat uncertainty and missing
values. The approach of using information about past experiences is furthermore
fully legitimate and reasonable, as it tries to imitate the behaviour and experience
of human system experts. Like domain experts applying their knowledge of
observed phenomenon behaviour to rectify decision support from automated prediction
systems when forecasting a new disaster, we investigated the injection of
domain-specific knowledge into the GA. In doing so, a faster convergence of the
GA towards fitter solutions can be reached and the risk of parameter estimation 
becoming the bottleneck of the overall prediction process is further diminished. 

5.3.4.1 Knowledge Insertion during Mutation 

In nature, mutation occurs very infrequently and can often result in a weaker indi- 
vidual. Occasionally, the result might be to produce a stronger one. In GA, mutation 
is an operator that changes the information contained in one or more gene values 
in an individual according to the defined mutation probability. This probability usually
is set fairly low. When the mutation process is directed towards promising zones in the
search space, transforming randomness into controlled variation, a significantly
increased mutation probability of up to 0.4 should be considered.

The most common mutation operator for real-valued parameters is the uniform
mutation that replaces the value of a chosen gene $x_{kj}$ with a uniform random value
selected from the problem-specific parameter range between lower and upper bound
$[lb_j, ub_j]$. Our guided mutation approach uses the domain knowledge contained in
the knowledge chromosome $kc$ to narrow valid ranges of parameter values. These
values then oscillate within their smaller limits, finally forcing the GA to adopt specific
values in certain dimensions from which we know that they will increase an
individual's fitness. The three steps to follow for each gene during mutation are:

1. Preparation

Compute the mutation probability for the gene by associating a random number
from the interval [0, 1] with the gene. The gene is mutated if the associated number
is less than the specified mutation rate.

2. Knowledge insertion

If knowledge is available for this gene in the knowledge chromosome $kc$, then
re-define the gene's range of valid parameter values. Set the new lower bound
$lb\_r_j$ to

$$lb\_r_j = kc_j - t_j \qquad (5.3)$$

and the re-defined upper bound $ub\_r_j$ to

$$ub\_r_j = kc_j + t_j, \qquad (5.4)$$

where $t_j$ is a configurable threshold that can be chosen independently for every
gene. If necessary, the bounds have to be repaired after knowledge insertion in
order to obtain a consistent program: If $lb\_r_j < lb_j$ then reset $lb\_r_j = lb_j$, and if
$ub\_r_j > ub_j$ then reset $ub\_r_j = ub_j$.

3. Mutation

If the gene needs to be mutated, it is modified by choosing a random value
from the original range or, if applicable, from the narrowed range of valid values.
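
The three steps can be condensed into a short Python sketch. The function signature is an assumption, missing knowledge for a gene is represented by `None`, and the fallback to the original range for a knowledge value far outside the valid bounds is an addition of this sketch rather than part of the described procedure.

```python
import random


def guided_mutation(individual, lb, ub, kc, t, rate, rng):
    """Mutate each gene with probability `rate`, narrowing its range where knowledge exists."""
    mutated = list(individual)
    for j in range(len(mutated)):
        # Step 1: preparation, decide whether this gene is mutated at all.
        if rng.random() >= rate:
            continue
        low, high = lb[j], ub[j]
        # Step 2: knowledge insertion with bound repair, following (5.3) and (5.4).
        if kc[j] is not None:
            low = max(lb[j], kc[j] - t[j])
            high = min(ub[j], kc[j] + t[j])
            if low > high:  # knowledge far outside the valid range: keep original bounds
                low, high = lb[j], ub[j]
        # Step 3: mutation, draw a uniform value from the (possibly narrowed) range.
        mutated[j] = rng.uniform(low, high)
    return mutated
```

With a mutation rate of 0 the individual is returned unchanged, and a gene without an entry in `kc` falls back to the ordinary uniform mutation over its full range.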

5.3.4.2 Knowledge Insertion during Population Initialisation 

In addition to directing mutation, population initialisation is an obvious phase during
GA execution to be supported with available problem-specific knowledge. Accord- 
ing to a predefined initialisation probability, part of the individuals could be seeded 
in areas where optimal solutions are likely to be found. Knowledge injection follows 
the same three steps as described for mutation. 

Again, during guidance, a valid subset of the original range is chosen by adding 
a threshold tj to the value of retrieved knowledge, instead of using the raw value 
without further modification. The knowledge retrieval process only returns the parameterisations
most similar to the real event and, depending on the level of detail
of the information in the KB, the configuration of the real fire might be found or not.

At present, we employ knowledge for wind speed and wind direction and cut the 
ranges for these two parameters because they are highly dynamic and model output 
proved extremely sensitive to them. The present thresholds used as margin are 5 deg 
for wind direction and 2 mph for wind speed. There exist other factors influencing 
fire spread (e.g. fuel moisture), which are, though less determining, also frequently 
changing and difficult to measure and it should therefore also be considered to guide 
their values during GA execution. 
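
Guided initialisation can be sketched along the same lines. The margins of 2 mph for wind speed and 5 deg for wind direction are taken from the text; the gene order, the parameter bounds, the knowledge values and the choice to apply the initialisation probability per individual are assumptions made for this example.

```python
import random

# Hypothetical gene order: wind speed (mph), wind direction (deg), fuel moisture (%).
LB = [0.0, 0.0, 0.0]
UB = [30.0, 360.0, 30.0]
KC = [10.0, 90.0, None]   # retrieved knowledge; None = no knowledge for this gene
T = [2.0, 5.0, None]      # margins: 2 mph for wind speed, 5 deg for wind direction


def guided_initialisation(p, lb, ub, kc, t, init_prob, rng):
    """Seed roughly init_prob * p individuals inside the knowledge-narrowed ranges."""
    population = []
    for _ in range(p):
        guided = rng.random() < init_prob
        individual = []
        for j in range(len(lb)):
            low, high = lb[j], ub[j]
            if guided and kc[j] is not None:
                low = max(lb[j], kc[j] - t[j])    # narrowed and repaired lower bound
                high = min(ub[j], kc[j] + t[j])   # narrowed and repaired upper bound
            individual.append(rng.uniform(low, high))
        population.append(individual)
    return population


population = guided_initialisation(20, LB, UB, KC, T, init_prob=0.5, rng=random.Random(42))
```

Keeping part of the population unguided preserves diversity in case the retrieved knowledge does not match the real fire well.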

5.4 Experimental Evaluation 

The experimentation's objective is to prove the benefit of introducing domain- 
specific knowledge. If inserted to guide the GA towards promising solution areas, 
we can significantly reduce the number of performed fitness evaluations while main- 
taining the prediction error magnitude. Experiments are divided into three different 
scenarios, which were all executed using the forest fire spread simulator fireLib HI. 
Presented results are based on a series of real fire plots performed as prescribed 
burns in the frame of the European SPREAD project f39l in Gestosa (Portugal) in 
2004. Figure I5T4I shows aerial photographs of three selected plots. Fire front evolu- 
tion through time of the used fire maps can be seen in figure 1575] The specific plot 
characteristics are outlined in table l5.1l In order to get around arbitrary evolution ef- 
fects and to get more descriptive results, all presented experiments are the averaged 
outcome of ten different initial populations. The first scenario demonstrates the su- 
periority of population size and diversity over evolutionary methods. In the second 
Fig. 5.4 Selected plots from SPREAD project: (a) plot 520, (b) plot 533 and (c) plot 751



Table 5.1 Characteristics of selected burns from SPREAD project 



Plot 



Width m 



Length m 



Slope 



Ignition type 



520 
533 
751 



95 

20 



91 

123 

30 



18° 
21° 
6° 



line ignition 
point ignition 
line ignition 



test scenario we show the effect of knowledge introduction into the GA during dif- 
ferent phases and the third experiment scenario illustrates how the same introduction 
of knowledge is able to reduce error variability. 

The first scenario illustrates the importance of population diversity. If numerous 
individuals are present in a randomly initialised population, these, by default, create 
a good coverage of the search space and evolutionary methods of the GA tend to 
have less influence. Figures 5.6 and 5.7 show the calibration step results for plots
520 and 751. Populations of 50 and 500 individuals, respectively, were used, and
results were obtained applying the described guidance during mutation and compared
to a non-guided version for both population sizes. Mutation probability was set to
0.2 when guiding and 0.01 otherwise. It can be observed that the best calibration
results are obtained in every case when many individuals are available. Guidance,
depending on the characteristics of each plot, is then only able to produce minor
enhancements. As the excessively increased runtime makes this practice infeasible in
real online applications, an acceptable compromise is to reduce the population size
to a practicable minimum and guide the GA. Working with a relatively small population
generates the highest calibration error in this experiment, but the introduction of
knowledge is able to compensate for a reduced number of individuals and to gain precision in
Fig. 5.5 Fire front evolution through time for (a) plot 520, (b) plot 533 and (c) plot 751



Fig. 5.6 Calibration step 
results for different GA 
configurations applied to 
plot 520 after five iterations 
the majority of time steps. Predominant influence of population diversity is further
supported by the fact that convergence of the GA is notably slower in bigger
populations, as shown by way of example in Figure 5.8 for minutes 12 to 14 of plot 520.
Very fit individuals are already present in the initial population.

The second scenario gives evidence that the introduction of problem-specific 
knowledge into a GA for input parameter estimation is able to clearly boost its per- 
formance in terms of runtime and, though to a minor extent, with regard to prediction 
quality. Figure 5.9(a) shows the input parameter refinement during the calibration stage
for plot 520 applying guided (guidance during mutation (0.2), population initialisation
(0.25), and both mutation and population initialisation) and non-guided GA versions
(population size 50). It can be observed that the incorporation of knowledge results
in a comparable or smaller calibration error and requires less than half of the
execution time. Using the optimised input parameter (the best individual after two or
five generations, respectively) for the subsequent prediction step, we can note in
Figure 5.9(b) that the parameter obtained after fewer GA iterations supported by
knowledge generates predictions with the same error magnitude as the parameter obtained
Fig. 5.7 Calibration step
results for different GA
configurations applied to
plot 751 after ten iterations




Fig. 5.8 Convergence for 
different GA configurations 
during calibration phase for 
plot 520 in minute 12 to 14 
Fig. 5.9 (a) Calibration errors (after five (non-guided) and two (guided) GA iterations) and
(b) prediction phase errors for plot 520 applying different GA configurations 



after five generations without supporting knowledge. Figures 5.10(a) and (b) show
a similar behaviour for plot 751, comparing the performance of the non-guided GA
approach after ten iterations with the guided GA approach after five iterations.

The third scenario analyses the error variability of the GA approach in its guided and
non-guided versions. Figure 5.11 plots the results for plot 533 including calibration
Fig. 5.10 (a) Calibration errors (after ten (non-guided) and five (guided) GA iterations) and
(b) prediction phase errors for plot 751 applying different GA configurations



Fig. 5.11 Calibration step 
and prediction step error for 
different time steps of plot 
533 applying guidance dur- 
ing mutation (observations 
after 5 generations) com- 
pared to the non-guided GA 
(observations after 10 gen- 
erations) including observed 
error variability. 
and prediction phase. In each of the two phases the mutation-guided version (results
after five GA iterations) is compared to the non-guided version (results after ten GA 
iterations). Mutation probability was set to 0.4 when guiding, 0.01 otherwise. Aver- 
age errors are depicted as columns and observed minimum and maximum values are 
provided as bars. We can clearly observe that, during calibration phase, the guided 
GA delivers improved results after half of the number of GA iterations. In addition, 
it significantly reduces error variability. Results oscillate in a smaller error range 
compared to the non-guided approach. During prediction phase, the guided GA ver- 
sion again manages to decrease error range and produces comparable predictions. 
The introduction of knowledge thus helps to guarantee acceptable results with low 
variance after few GA iterations. 



5.5 Conclusions and Future Work 



In this chapter, we have presented an approach for input parameter estimation for 
real-time disaster modelling using the example of forest fire spread prediction. The 
calibration process takes advantage of domain-specific knowledge guiding a parallel 
GA during mutation and population initialisation towards promising solution areas. 
This notably speeds up the calibration of highly dynamic or unavailable model input. 
The method is able to ensure that the resulting parameter value sets oscillate within
a certain error range. The gain in estimation speed helps the simulation application
to cope with real-time requirements and to deliver prediction results closer to
reality for online decision support systems.

Future work aims to determine the benefit of including additional domain- 
dependent information (e.g. fire shape, geographical and seasonal information) into 
the KB and to prove the potential of the approach with a different fire simulator 
or another disaster application. Experiments with larger real plots are ongoing. The 
combination of GA with an intelligent paradigm (IP, e.g. fuzzy inference system, 
neural network) to form an Evolutionary Intelligent System delivered first promis- 
ing results and the application of further IP is currently investigated. 
Chapter 6

Harmony Search Algorithms in Structural Engineering

M.P. Saka, I. Aydogdu, O. Hasancebi, and Z.W. Geem 



Abstract. The harmony search method has been widely applied in structural design
optimization since its emergence. These applications have shown that the harmony
search algorithm is a robust, effective and reliable optimization method. In recent
years several enhancements have been suggested to improve the performance of the
algorithm. Among these, Mahdavi has presented two versions of the harmony search
method, named the improved harmony search method and the global best harmony search
method. Saka and Hasancebi (2009) have suggested an adaptive harmony search in which
the harmony search parameters are adjusted automatically during design iterations.
Coelho has proposed an improved harmony search method, suggesting an expression for
one of the parameters of the standard harmony search method. In this chapter, the
optimum design problem of steel space frames is formulated according to the
provisions of LRFD-AISC (Load and Resistance Factor Design, American Institute of
Steel Construction). The weight of the steel frame is taken as the objective function
to be minimized. Seven different structural optimization algorithms are developed,
each of which is based on one of the above-mentioned versions of the harmony search
method. Three real-size steel frames are designed using each of these algorithms. The
optimum designs obtained by these techniques are compared and the performance of each
version is evaluated.

Keywords: Structural Optimization, Metaheuristic Techniques, Harmony Search 
Algorithm. 

6.1 Introduction 

Building construction uses large amounts of land and consumes energy and water.
Furthermore, production of the materials used in building construction adds a large
amount of pollution to the atmosphere. It is important to reduce the amount of
natural resources utilized in building construction if we would like to reduce the
environmental impact and achieve sustainable development. Green construction, which
is becoming common practice all over the world, intends to construct buildings that
are environmentally friendly and resource-efficient throughout their life cycle. This
practice covers all stages from siting to design, construction and operation.
Overdesign and the use of excessive material are undesirable because they consume
more natural resources and add more pollution to the atmosphere. Hence, for
sustainable development, structures should be designed and built using a sufficient
amount of material but no more. Structural design optimization tools try to achieve
exactly this goal. They aim to design a steel structure such that the frame has
minimum weight while its response under the external loads that it may be subjected
to during its lifetime remains within the design code limitations. The design of
steel structures has its own features and is not similar to the design of other
structures. The designer cannot use any section she or he may desire but has to
select from the set of steel profiles available in practice for the beams and columns
of the frame under consideration. This selection must be carried out such that the
frame with the selected steel profiles has displacements less than those prescribed
in the design code and its members have sufficient strength to satisfy the strength
limitations under the external loads, while at the same time its cost is minimum.

In this chapter, the design optimization problem of steel space frames according to
the provisions of LRFD-AISC (Load and Resistance Factor Design, American Institute of
Steel Construction) [1] is presented first. The weight of the steel frame is taken as
the objective function to be minimized. Such a formulation of the optimum design
problem yields a discrete programming problem. The solution of this programming
problem is obtained by the harmony search algorithm [2-6]. This method is one of the
recent combinatorial optimization techniques that belong to the general class of
metaheuristic algorithms. Metaheuristic algorithms [7-11] find the solution of
optimization problems by utilizing certain tactics that are generally, though not
exclusively, inspired by nature, instead of classical procedures that move along the
descent direction of the gradient of the objective function. The harmony search
method mimics music improvisation. It has been widely applied in structural design
optimization since its emergence [12-19]. These applications have shown that the
harmony search algorithm is a robust, effective and reliable optimization method. In
recent years several enhancements have been suggested to improve the performance of
the harmony search method. Among these, Mahdavi [20, 21] has presented two versions
of the harmony search method, named the improved harmony search method and the global
best harmony search method. Hasancebi et al. (2010) [22, 23] have suggested an
adaptive harmony search in which the harmony search parameters are adjusted
automatically during design iterations. Coelho [24] has proposed an improved harmony
search method, suggesting an expression for one of the parameters of the standard
harmony search method. In this chapter seven different structural optimization
algorithms are developed, each of which is based on one of the above-mentioned
versions of the harmony search method. Three steel space frames are designed using
each of these algorithms. The optimum designs obtained by each of these techniques
are compared and the performance of each version is evaluated.

6.2 Discrete Optimum Design of Steel Space Frames to 
LRFD-AISC 

The design of steel space frames necessitates the selection of steel sections for its
columns and beams from a standard steel section table such that the frame satisfies
the serviceability and strength requirements specified by the code of practice
while economy is observed in the overall or material cost of the frame. When
the design constraints are implemented from LRFD-AISC, the following discrete
programming problem is obtained.

6.2.1 The Objective Function 

The objective function is taken as the weight of the frame, which is to be minimized
and is expressed as follows.

\[
\text{Minimize } W = \sum_{r=1}^{ng} m_r \sum_{s=1}^{t_r} l_s \qquad (6.1)
\]

where $W$ is the weight of the frame, $m_r$ is the unit weight of the steel section
selected from the standard steel sections table that is to be adopted for group $r$,
$t_r$ is the total number of members in group $r$ and $ng$ is the total number of
groups in the frame. $l_s$ is the length of member $s$ which belongs to group $r$.
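
As a small illustration of Eqn. (6.1), the weight can be evaluated group by group. The sketch below assumes that the unit weights and member lengths are already available as plain lists; the function name `frame_weight` and the numerical values are illustrative only.

```python
def frame_weight(unit_weights, member_lengths):
    """Evaluate W = sum_r m_r * sum_s l_s.

    unit_weights[r]   : unit weight m_r of the section adopted for group r
    member_lengths[r] : lengths l_s of the members that belong to group r
    """
    if len(unit_weights) != len(member_lengths):
        raise ValueError("one list of member lengths is required per group")
    return sum(m_r * sum(lengths)
               for m_r, lengths in zip(unit_weights, member_lengths))


# Two groups (illustrative numbers): three columns of 4 m and two beams of 6 m.
W = frame_weight([50.0, 30.0], [[4.0, 4.0, 4.0], [6.0, 6.0]])
```

Because every member of a group shares one section, changing the section of a group changes only its $m_r$, which keeps re-evaluation of the weight cheap during the search.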

6.2.2 Strength Constraints 

For the case where the effect of warping is not included in the computation of the 
strength capacity of W-sections that are selected for beam-column members of the 
frame the following inequalities given in Chapter H of LRFD-AISC are required 
to be satisfied. 
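
For orientation, the Chapter H interaction inequalities of LRFD-AISC to which Eqns. (6.2) and (6.3) refer are commonly stated in the normalised form sketched here. The constraint label $g_{s,i}$ and the placement of the indices $i$ (member) and $l$ (load case) are assumptions made for illustration; the factor 8/9 and the 0.2 threshold are those of the specification.

```latex
\[
g_{s,i} = \left(\frac{P_u}{\phi P_n}\right)_{il}
        + \frac{8}{9}\left(\frac{M_{ux}}{\phi_b M_{nx}}
        + \frac{M_{uy}}{\phi_b M_{ny}}\right)_{il} - 1 \le 0
\quad \text{for } \left(\frac{P_u}{\phi P_n}\right)_{il} \ge 0.2
\qquad (6.2)
\]
\[
g_{s,i} = \left(\frac{P_u}{2\phi P_n}\right)_{il}
        + \left(\frac{M_{ux}}{\phi_b M_{nx}}
        + \frac{M_{uy}}{\phi_b M_{ny}}\right)_{il} - 1 \le 0
\quad \text{for } \left(\frac{P_u}{\phi P_n}\right)_{il} < 0.2
\qquad (6.3)
\]
```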
where $M_{nx}$ is the nominal flexural strength about the strong axis (x axis),
$M_{ny}$ is the nominal flexural strength about the weak axis (y axis), $M_{ux}$ is
the required flexural strength about the strong axis (x axis), $M_{uy}$ is the
required flexural strength about the weak axis (y axis), $P_n$ is the nominal axial
strength (tension or compression) and $P_u$ is the required axial strength (tension
or compression) for member $i$; $l$ represents the loading case. The values of
$M_{ux}$ and $M_{uy}$ are required to be obtained by carrying out a $P$-$\Delta$
analysis of the steel frame. This is an iterative process which is quite time
consuming. In Chapter C of LRFD-AISC an alternative procedure is suggested for the
computation of the $M_{ux}$ and $M_{uy}$ values. In this procedure, two first-order
elastic analyses are carried out. In the first, the frame is analyzed under the
gravity loads only, with the sway of the frame prevented, to obtain the $M_{nt}$
values. In the second, the frame is analyzed only under the lateral loads to find the
$M_{lt}$ values. These moment values are then combined using the following equation
as given in the design code.



\[
M_u = B_1 M_{nt} + B_2 M_{lt} \qquad (6.4)
\]

where $B_1$ is the moment magnifier coefficient and $B_2$ is the sway moment
magnifier coefficient. The details of how these coefficients are calculated are given
in Chapter C of LRFD-AISC [1].

Eqns. (6.2) and (6.3) represent the strength constraints for doubly and singly
symmetric steel members subjected to axial force and bending. If the axial force in
member $k$ is tensile, the terms in these equations are as follows: $P_{uk}$ is the
required axial tensile strength, $P_{nk}$ is the nominal tensile strength, $\phi$
becomes $\phi_t$, the strength reduction factor in tension, which is given as 0.90
for yielding in the gross section and 0.75 for fracture in the net section, $\phi_b$
is the strength reduction factor for flexure given as 0.90, $M_{uxk}$ and $M_{uyk}$
are the required flexural strengths and $M_{nxk}$ and $M_{nyk}$ are the nominal
flexural strengths about the major and minor axes of member $k$, respectively. It
should be pointed out that the required flexural strength should include second-order
effects. LRFD suggests an approximate procedure for the computation of such effects,
which is explained in Section C1 of LRFD. If the axial force in member $k$ is
compressive, the terms in Eqns. (6.2) and (6.3) are defined as follows: $P_{uk}$ is
the required compressive strength, $P_{nk}$ is the nominal compressive strength, and
$\phi$ becomes $\phi_c$, the resistance factor for compression, given as 0.85. The
remaining notation in Eqns. (6.2) and (6.3) is the same as defined above.
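
A strength check of a single member for one load case might be sketched as follows. The code assumes the usual Chapter H form of the interaction inequalities (threshold 0.2 on the axial ratio and the factor 8/9) and the moment combination of Eqn. (6.4); the function names and all numerical values are illustrative assumptions, not data from the chapter.

```python
def required_moment(B1, M_nt, B2, M_lt):
    """Combine no-sway and sway moments as in Eqn. (6.4): M_u = B1*M_nt + B2*M_lt."""
    return B1 * M_nt + B2 * M_lt


def interaction_ratio(P_u, P_n, M_ux, M_nx, M_uy, M_ny, phi, phi_b=0.90):
    """Left-hand side of the axial-flexure interaction check (assumed Chapter H form)."""
    axial = P_u / (phi * P_n)
    bending = M_ux / (phi_b * M_nx) + M_uy / (phi_b * M_ny)
    if axial >= 0.2:
        return axial + (8.0 / 9.0) * bending
    return axial / 2.0 + bending


def strength_constraint(*args, **kwargs):
    """Normalised constraint g = ratio - 1; the member is adequate when g <= 0."""
    return interaction_ratio(*args, **kwargs) - 1.0


# Illustrative compression member: phi = phi_c = 0.85, consistent force and moment units.
M_ux = required_moment(B1=1.0, M_nt=100.0, B2=1.1, M_lt=50.0)
g = strength_constraint(400.0, 1000.0, M_ux, 400.0, 0.0, 150.0, phi=0.85)
```

Writing the check as `ratio - 1` gives every strength constraint the same "satisfied when non-positive" form as the displacement and geometric constraints, which makes constraint violations directly comparable.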

The nominal tensile strength of member $k$ for yielding in the gross section is
computed as $P_{nk} = F_y A_{gk}$, where $F_y$ is the specified yield stress and
$A_{gk}$ is the gross area of member $k$. The nominal compressive strength of member
$k$ is computed as $P_{nk} = A_{gk} F_{cr}$, where

\[
F_{cr} = \left(0.658^{\lambda_c^2}\right) F_y \ \text{ for } \lambda_c \le 1.5, \qquad
F_{cr} = \left(\frac{0.877}{\lambda_c^2}\right) F_y \ \text{ for } \lambda_c > 1.5, \qquad
\lambda_c = \frac{K l}{r \pi} \sqrt{\frac{F_y}{E}} .
\]

In these expressions $E$ is the modulus of elasticity, and $K$ and $l$ are the
effective length factor and the laterally unbraced length of member $k$, respectively.
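
The two-branch expression for the critical stress translates directly into code. The sketch below assumes consistent units (for example N and mm) and takes `r` to be the governing radius of gyration, a symbol that appears in the slenderness expression but is not defined in the text; the function names and section properties are illustrative.

```python
import math


def slenderness(F_y, E, K, l, r):
    """Column slenderness parameter lambda_c = (K*l)/(r*pi) * sqrt(F_y/E)."""
    return (K * l) / (r * math.pi) * math.sqrt(F_y / E)


def nominal_compressive_strength(A_g, F_y, E, K, l, r):
    """P_n = A_g * F_cr, with the inelastic or elastic branch chosen by lambda_c."""
    lam = slenderness(F_y, E, K, l, r)
    if lam <= 1.5:
        F_cr = (0.658 ** (lam ** 2)) * F_y   # inelastic buckling branch
    else:
        F_cr = (0.877 / lam ** 2) * F_y      # elastic buckling branch
    return A_g * F_cr


# Illustrative member in N and mm: the same section with a short and a long unbraced length.
P_short = nominal_compressive_strength(10000.0, 250.0, 200000.0, 1.0, 4000.0, 100.0)
P_long = nominal_compressive_strength(10000.0, 250.0, 200000.0, 1.0, 20000.0, 100.0)
```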

6.2.3 Displacement Constraints 

The lateral displacements and the deflections of beams in steel frames are limited by
the steel design codes due to serviceability requirements. According to the ASCE
Ad Hoc Committee report [25], the accepted range of drift limits in first-order
analysis is 1/750 to 1/250 times the building height $H$, with a recommended value
of $H$/400. The typical limits on the inter-story drift are 1/500 to 1/200 times the
story height. Based on this report, deflection limits are proposed in [26, 27, 28]
for general use; they are repeated in Table 6.1.

Table 6.1 Displacement limitations for steel frames 

Item  Deflection                                      Limit
1     Floor girder deflection for service live load   L/360
2     Roof girder deflection                          L/240
3     Lateral drift for service wind load             H/400
4     Inter-story drift for service wind load         H/300

6.2.3.1 Deflection Constraints 

It is necessary to limit the mid-span deflections of beams in a steel space frame
so that excessive displacements do not cause cracks in the brittle finishes they may
support. Deflection constraints can be expressed as an inequality limitation as
follows.

\[
g_{dj} = \frac{\delta_{jl}}{\delta_j^{u}} - 1 \le 0, \qquad
j = 1, \ldots, n_{sm}, \quad l = 1, \ldots, n_{lc} \qquad (6.5)
\]

where $\delta_{jl}$ is the maximum deflection of the $j$-th member under the $l$-th
load case, $\delta_j^{u}$ is the upper bound on this deflection, which is defined in
the code as span/360 for beams carrying brittle finishes, $n_{sm}$ is the total
number of members where deflection limitations are to be imposed and $n_{lc}$ is the
number of load cases.

6.2.3.2 Drift Constraints 

These constraints are of two types. One is the restriction applied to the top story 
sway and the other is the limitation applied on the inter-story drift. 
Top Story Drift Constraint

Top story drift limitation can be expressed as an inequality constraint as follows.

\[
g_{tdj} = \frac{(\Delta_{top})_{jl}}{H/\text{Ratio}} - 1 \le 0, \qquad
j = 1, \ldots, n_{jtop}, \quad l = 1, \ldots, n_{lc} \qquad (6.6)
\]

where $H$ is the height of the frame, $n_{jtop}$ is the number of joints on the top
story, $n_{lc}$ is the number of load cases and $(\Delta_{top})_{jl}$ is the top
story drift of the $j$-th joint under the $l$-th load case. Ratio is a constant value
given in the ASCE Ad Hoc Committee report [25].



Inter-story Drift 

In multi-story steel frames the relative lateral displacement of each floor is
required to be limited. This limit is defined as the maximum inter-story drift, which
is specified as $h_{sx}$/Ratio, where $h_{sx}$ is the story height and Ratio is a
constant value given in the ASCE Ad Hoc Committee report [25].

\[
g_{idj} = \frac{(\Delta_{oh})_{jl}}{h_{sx}/\text{Ratio}} - 1 \le 0, \qquad
j = 1, \ldots, n_{st}, \quad l = 1, \ldots, n_{lc} \qquad (6.7)
\]

where $n_{st}$ is the number of stories, $n_{lc}$ is the number of load cases and
$(\Delta_{oh})_{jl}$ is the story drift of the $j$-th story under the $l$-th load case.
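
The drift constraints share one normalised form, response divided by limit minus one. The following sketch evaluates the top-story and inter-story drift constraints for a single joint line and a single load case. The Ratio values 400 and 300 are taken from Table 6.1, while the function names, the story heights and the drift values are illustrative assumptions.

```python
def normalized_constraint(response, limit):
    """g = response/limit - 1; the constraint is satisfied when g <= 0."""
    return response / limit - 1.0


def drift_constraints(top_drift, story_drifts, story_heights,
                      ratio_top=400.0, ratio_story=300.0):
    """Return (g_top, [g_story_1, ...]) for one load case."""
    H = sum(story_heights)  # total frame height
    g_top = normalized_constraint(top_drift, H / ratio_top)
    g_story = [normalized_constraint(d, h / ratio_story)
               for d, h in zip(story_drifts, story_heights)]
    return g_top, g_story


# Three stories of 3.5 m each; drifts in metres (illustrative values).
g_top, g_story = drift_constraints(0.021, [0.008, 0.007, 0.006], [3.5, 3.5, 3.5])
```

In a full design evaluation the same two functions would be applied to every top-story joint, every story and every load case, and the largest value of each would indicate how close the design is to its serviceability limits.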

6.2.4 Geometric Constraints 

In steel frames it is not desirable for a column in an upper story to have a larger
section than the column in the story below, for practical reasons: a larger section
in the upper story requires a special joint arrangement, which is neither preferred
nor economical. The same applies to the beam-to-column connections. The W-section
selected for any beam should have a flange width smaller than or equal to the flange
width of the W-section selected for the column to which the beam is to be connected.
These limitations, shown in Fig. 6.1 and named geometric constraints, are included in
the design optimization model to satisfy practical requirements. Two types of
geometric constraints are considered in the mathematical model: column-to-column
geometric constraints and beam-to-column geometric constraints.

6.2.4.1 Column-to-Column Geometric Constraints 

The depth and the unit weight of the W sections selected for the columns of two
consecutive stories should either be equal to each other, or the one in the upper
story should be smaller than the one in the lower story. These limitations are
included in the design problem as inequality constraints as follows.

\[
g_{cdi} = \frac{D_i}{D_{i-1}} - 1 \le 0, \qquad i = 2, \ldots, n_j \qquad (6.8)
\]
\[
g_{cmi} = \frac{m_i}{m_{i-1}} - 1 \le 0, \qquad i = 2, \ldots, n_j \qquad (6.9)
\]

where $n_j$ is the number of stories, $m_i$ is the unit weight of the W section
selected for the column of story $i$, $m_{i-1}$ is the unit weight of the W section
selected for the column of story $(i-1)$, $D_i$ is the depth of the W section
selected for the column of story $i$ and $D_{i-1}$ is the depth of the W section
selected for the column of story $(i-1)$.

6.2.4.2 Beam-to-Column Geometric Constraints 

When a beam is connected to the flange of a column, the flange width of the beam
should be less than or equal to the flange width of the column so that the connection
can be made without difficulty. When a beam is connected to the web of a column, the
flange width of the beam should be less than or equal to the clear dimension
$(D - 2t_b)$ of the column web at the connection, where $D$ and $t_b$ are the depth
and the flange thickness of the W section, respectively, as shown in Fig. 6.1.

\[
g_{bci} = \frac{(B_f)_{bi}}{D_{ci} - 2(t_b)_{ci}} - 1 \le 0, \qquad
i = 1, \ldots, n_{j1} \qquad (6.10)
\]

or

\[
g_{bbi} = \frac{(B_f)_{bi}}{(B_f)_{ci}} - 1 \le 0, \qquad
i = 1, \ldots, n_{j2} \qquad (6.11)
\]

where $n_{j1}$ is the number of joints where beams are connected to the web of a
column, $n_{j2}$ is the number of joints where beams are connected to the flange of a
column, $D_{ci}$ is the depth of the W section selected for the column at joint $i$,
$(t_b)_{ci}$ is the flange thickness of the W section selected for the column at
joint $i$, $(B_f)_{ci}$ is the flange width of the W section selected for the column
at joint $i$ and $(B_f)_{bi}$ is the flange width of the W section selected for the
beam at joint $i$.
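
The geometric constraints can be checked with a few one-line functions. The sketch below assumes the normalised forms of Eqns. (6.8) to (6.11), with the upper-story value divided by the lower-story value and the beam flange width divided by the available column dimension; the function names and the section dimensions (in mm and kg/m) are illustrative and do not correspond to particular W-sections.

```python
def column_to_column(D_upper, D_lower, m_upper, m_lower):
    """Return (g_cd, g_cm): depth and unit-weight constraints between two stories."""
    return D_upper / D_lower - 1.0, m_upper / m_lower - 1.0


def beam_to_column_web(Bf_beam, D_col, tb_col):
    """Beam framing into the column web: beam flange must fit within D - 2*t_b."""
    return Bf_beam / (D_col - 2.0 * tb_col) - 1.0


def beam_to_column_flange(Bf_beam, Bf_col):
    """Beam framing into the column flange: beam flange no wider than column flange."""
    return Bf_beam / Bf_col - 1.0


# Illustrative checks; a value <= 0 means the constraint is satisfied.
g_cd, g_cm = column_to_column(300.0, 350.0, 80.0, 100.0)
g_web = beam_to_column_web(200.0, 350.0, 20.0)
g_flange = beam_to_column_flange(260.0, 250.0)  # beam flange too wide, g > 0
```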

The optimum design problem of steel space frames stated above, where the objective
function is given in Eqn. (6.1) and the constraints are described in Eqns. (6.2) to
(6.11), is a discrete combinatorial optimization problem. This is because its
solution necessitates the selection of appropriate steel sections for the beams and
columns of the frame from a list of W-sections such that the objective function in
Eqn. (6.1) has its minimum value while the design constraints given in Eqns. (6.2) to
(6.11) are satisfied. The designer has to find the combination of W-sections that
minimizes the frame weight while all design code provisions are satisfied. Here the
selection of a W-section from an available steel profile list is carried out by
choosing an integer from the set of integers running from 1 to the total number of
sections in the list. This integer is the sequence number of that particular
W-section. Hence a design solution is a set of integers, each of which represents the
sequence number of a W-section in the design pool. Problems of this kind are
combinatorial optimization problems [7, 8].
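
The integer encoding can be illustrated with a toy design pool. The section labels and unit weights below are hypothetical placeholders rather than entries of a real W-section table, and `decode` is an illustrative helper name.

```python
# Hypothetical design pool: (label, unit weight); position in the list + 1 is the sequence number.
SECTION_POOL = [("S1", 20.0), ("S2", 35.0), ("S3", 50.0), ("S4", 65.0)]


def decode(design, pool=SECTION_POOL):
    """Map a design vector of 1-based sequence numbers to the selected sections."""
    for number in design:
        if not 1 <= number <= len(pool):
            raise ValueError(f"sequence number {number} is outside the design pool")
    return [pool[number - 1] for number in design]


design = [4, 2, 2]            # one sequence number per member group
sections = decode(design)
n_combinations = len(SECTION_POOL) ** len(design)  # size of the discrete search space
```

The search space grows as the pool size raised to the number of groups, which is why exhaustive enumeration is impractical for real frames and a stochastic search is used instead.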




Fig. 6.1 Beam column geometric constraints 

In the last two decades a new kind of algorithm has emerged which makes use of
certain heuristics in order to explore a search space efficiently. These methods are
called metaheuristic algorithms [7-11]. A metaheuristic is an iterative process with
a set of concepts that are used for exploring and exploiting the search space to
determine the best solution among the alternative solutions. Metaheuristic algorithms
are not problem specific; they are approximate and usually non-deterministic. It is
important that there is a dynamic balance between diversification and intensification
in a metaheuristic procedure. Diversification generally refers to the exploration of
the search space, and intensification refers to the exploitation of the accumulated
search experience [7]. The balance between these two concepts is important so that
the algorithm does not waste too much time in regions of the search space that do not
contain high-quality solutions, yet can quickly find the regions that do. Some
metaheuristic algorithms employ strategies that are inspired by nature. They
translate natural phenomena such as survival of the fittest, the immune system, swarm
intelligence and the cooling process of molten metals through annealing into a
numerical algorithm. They are named according to the natural phenomenon on which
their search strategy is based, such as evolutionary algorithms, immune system
algorithms, particle swarm optimization and simulated annealing. Metaheuristic
methods are non-traditional stochastic search and optimization methods, and they are
very suitable and efficient for finding the solution of combinatorial optimization
problems.

It is shown in the literature that the harmony search method, which is one of the
recently developed metaheuristic techniques, is quite effective and robust in solving
structural optimization problems [12-19]. A performance evaluation of seven
metaheuristic techniques used in the optimum design of pin-jointed and rigidly
jointed real-size steel frames is carried out in [16, 19]. In these studies it is
shown that the harmony search algorithm is a quite successful stochastic search
method and that its performance in some problems is better than that of some other
metaheuristic methods. Since its emergence, a number of enhancements have been
suggested in order to improve the performance of the standard harmony search method.
In this chapter these improvements are employed in solving the optimum design problem
of steel space frames described above, and their performances are compared.



6.3 Harmony Search Algorithms 

The harmony search algorithm (HS) was originated by Geem et al. [2]. The algorithm
was inspired by the musical performance process that occurs when a musician searches
for a perfect state of harmony, such as during jazz improvisation [2-6]. The analogy
between finding a pleasing harmony in music and finding the optimum solution in an
optimization problem is illustrated in Figure 6.2. A musician always intends to
produce a piece of music with perfect harmony. Similarly, the optimal solution of an
optimization problem should be the best solution available to the problem under the
given objective and constraints. Both processes aim at reaching the best solution,
that is, the optimum.




Fig. 6.2 Analogy between music improvisation and optimization [5] 
6.3.1 Standard Harmony Search Algorithm

The harmony search method imitates the improvisation process of a skilled musician.
When improvising, a musician has three possible choices: (a) play any tune from his
or her memory; (b) play something similar to the aforementioned tune by adjusting its
pitch slightly; (c) play a completely new tune. These three options are simulated by
three components of the harmony search method: usage of the harmony memory matrix
($\mathbf{H}$), pitch adjusting ($par$) and randomization. Before initiating the
design process, a set of steel sections selected from an available profile list is
collected in a design pool. Each steel section is assigned a sequence number that
varies between 1 and the total number of sections ($N_{sec}$) in the list. It is
important to note that during the optimization process the selection of sections for
design variables is carried out using these numbers. The steps of the algorithm are
outlined in the following as given in [2]:

6.3.1.1 Initialization of Harmony Memory Matrix 

A harmony memory matrix $\mathbf{H}$, given in Eqn. (6.12), is randomly generated.
The harmony memory matrix simply represents a design population for the solution of
the problem under consideration, and incorporates a predefined number of solution
vectors referred to as the harmony memory size ($hms$). Each solution vector (harmony
vector, $\mathbf{I}^i$) consists of $ng$ design variables and is represented in a
separate row of the matrix; consequently the size of $\mathbf{H}$ is ($hms \times ng$).



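\[
\mathbf{H} =
\begin{bmatrix}
I_1^{1}   & I_2^{1}   & \cdots & I_{ng}^{1}   \\
I_1^{2}   & I_2^{2}   & \cdots & I_{ng}^{2}   \\
\vdots    & \vdots    & \ddots & \vdots       \\
I_1^{hms} & I_2^{hms} & \cdots & I_{ng}^{hms}
\end{bmatrix}
\begin{matrix}
\phi(\mathbf{I}^{1}) \\ \phi(\mathbf{I}^{2}) \\ \vdots \\ \phi(\mathbf{I}^{hms})
\end{matrix}
\qquad (6.12)
\]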



6.3.1.2 Evaluation of Harmony Memory Matrix 

The $hms$ solutions are then analyzed, and their objective function values are
calculated. The evaluated solutions are sorted in the matrix in increasing order of
objective function value, that is $\phi(\mathbf{I}^{1}) \le \phi(\mathbf{I}^{2}) \le \ldots \le \phi(\mathbf{I}^{hms})$.
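
Initialisation and evaluation of the harmony memory can be sketched as follows. The sketch assumes that each design variable is an integer sequence number between 1 and `n_sec`; the stand-in objective (the sum of the sequence numbers) replaces the structural analysis purely for illustration, and the function names are hypothetical.

```python
import random


def initialize_memory(hms, ng, n_sec, objective, rng):
    """Create hms random harmony vectors of ng variables and sort them by objective value."""
    memory = [[rng.randint(1, n_sec) for _ in range(ng)] for _ in range(hms)]
    memory.sort(key=objective)  # best (lowest objective) vector comes first
    return memory


def toy_objective(design):
    """Stand-in for the frame evaluation: larger sequence numbers count as heavier."""
    return sum(design)


memory = initialize_memory(hms=5, ng=3, n_sec=10,
                           objective=toy_objective, rng=random.Random(1))
```

Keeping the memory sorted means that the worst harmony is always the last row, which simplifies the update step described later.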



6.3.1.3 Improvising a New Harmony 

In the harmony search algorithm the generation of a new solution (harmony) vector is
controlled by two parameters ($hmcr$ and $par$) of the technique. The harmony memory
considering rate ($hmcr$) is a probability value that biases the algorithm to select
a value for a design variable either from the harmony memory or from the entire set
of discrete values used for the variable. That is to say, this parameter decides to
what extent previously visited favorable solutions should be considered in comparison
with the exploration of new design regions while generating new solutions. When the
variable is selected from the harmony memory, it is checked whether this value should
be substituted with its immediate lower or upper neighbor in the discrete set. Here
the goal is to encourage a more explorative search by allowing transitions to designs
in the vicinity of the current solutions. This is known as pitch adjustment in the
harmony search method, and it is controlled by the pitch adjusting rate parameter
($par$). In the standard algorithm both of these parameters are set to suitable
constant values for all harmony vectors generated, regardless of whether an
exploitative or explorative search is actually required at a given time during the
search process. Accordingly, a new harmony
$\mathbf{I}' = [I'_1, I'_2, \ldots, I'_{ng}]$ is improvised (generated) by selecting
each design variable from either the harmony memory or the entire discrete set. The
probability that a design variable is selected from the harmony memory is controlled
by the harmony memory considering rate ($hmcr$). To execute this probability, a
random number $r_i$ is generated between 0 and 1 for each variable $I'_i$. If $r_i$
is smaller than or equal to $hmcr$, the variable is chosen from the harmony memory.
Otherwise, a random value is assigned to the variable from the entire discrete set,
as shown in Eqn. (6.13).



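$$
I_i^{new} = \begin{cases} I_i^{new} \in \{I_i^{1}, I_i^{2}, \dots, I_i^{hms}\} & \text{if } r_i \le hmcr \\ I_i^{new} \in \{1, \dots, ng\} & \text{if } r_i > hmcr \end{cases} \qquad (6.13)
$$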

If a design variable attains its value from harmony memory, it is checked whether
this value should be pitch-adjusted or not. In pitch adjustment, the value of a design
variable is altered to its immediately upper or lower neighboring value, obtained by
adding ±1 to its current value. Like hmcr, this operation is applied with a
probability, known as the pitch adjustment rate (par). If it is not activated by par,
the value of the variable does not change, as given in Eqn. (6.14).

$$
I_i^{new\prime} = \begin{cases} I_i^{new} \pm bw & \text{if } r_i \le par \\ I_i^{new} & \text{if } r_i > par \end{cases} \qquad (6.14)
$$

where bw is the arbitrary distance bandwidth, which is taken as 1 in the standard
harmony search method.
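
The improvisation step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' program: the function name `improvise`, the argument `n_values` (the assumed size of the discrete set) and the clamping of a pitch-adjusted index to the range 1 to `n_values` are assumptions made here, since the text does not say how values at the ends of the discrete set are treated.

```python
import random


def improvise(memory, n_values, hmcr=0.9, par=0.3, bw=1, rng=random):
    """Build one new harmony from `memory`, a list of integer index vectors."""
    n_vars = len(memory[0])
    new = []
    for i in range(n_vars):
        if rng.random() <= hmcr:
            # Memory consideration: copy variable i from a randomly chosen harmony.
            value = rng.choice(memory)[i]
            if rng.random() <= par:
                # Pitch adjustment: move to a neighbouring index (+bw or -bw).
                value += rng.choice((-bw, bw))
                # Assumption: clamp to the valid index range of the discrete set.
                value = min(max(value, 1), n_values)
        else:
            # Random selection from the whole discrete set {1, ..., n_values}.
            value = rng.randint(1, n_values)
        new.append(value)
    return new
```

Calling `improvise(memory, n_values=10)` with a memory such as `[[1, 2, 3], [4, 5, 6]]` returns a three-entry list of section indices; passing a seeded `random.Random` instance as `rng` makes the draw repeatable.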

6.3.1.4 Update of Harmony Matrix 

After generating the new harmony vector, its objective function value is calculated.
If this value is better (lower) than that of the worst harmony vector in the
harmony memory, it is included in the matrix while the worst one is discarded
from the matrix. The updated harmony memory matrix is then sorted in
ascending order of the objective function value.
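
A minimal Python sketch of this update rule is given below, assuming the memory is held as a list of design vectors with a parallel list of objective values. The function name `update_memory` and this data layout are illustrative choices, not taken from the text.

```python
def update_memory(memory, costs, candidate, candidate_cost):
    """Replace the worst harmony if the candidate is better, then sort ascending."""
    worst = max(range(len(costs)), key=costs.__getitem__)
    if candidate_cost < costs[worst]:
        memory[worst] = candidate
        costs[worst] = candidate_cost
    # Re-sort both lists in ascending order of objective value.
    order = sorted(range(len(costs)), key=costs.__getitem__)
    memory[:] = [memory[j] for j in order]
    costs[:] = [costs[j] for j in order]
    return memory, costs
```

Both lists are modified in place, so the best design is always `memory[0]` after the call.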

6.3.1.5 Termination

Steps 3.1.2 and 3.1.3 are repeated until a pre-assigned maximum number of
cycles is reached.

6.4 Various Harmony Search Algorithms 

In recent years, a number of enhancements to the standard harmony search method have
been suggested in order to improve its performance. In this study, seven variations of
the harmony search algorithm are considered for the solution of the optimum design
problem of steel space frames. These techniques are summarized in the following.

6.4.1 Standard Harmony Search with Adaptive Error Strategy 
(SHSAES) 

This version is the same as the standard harmony search method and follows the
steps explained above. The only difference is that, in addition to feasible solution
vectors, slightly infeasible solution vectors are also included in the harmony memory
matrix. Candidate solution vectors that slightly violate one or more design constraints
are accepted because they may possess appropriate values for some of the design
variables, which can be used in pitch adjusting in the next iteration. It should be
noted that such candidate design vectors are allowed in the beginning phase of the
design process, but they are required to be removed from the harmony memory matrix
towards the final phases of the design cycles. This is achieved by using a larger error
value initially and then reducing this value during the design cycles according to the
expression given below.



$$
Tol(i) = Tol_{max} - \frac{(Tol_{max} - Tol_{min})\, i^{0.5}}{(iter_{max})^{0.5}} \qquad (6.15)
$$



where $Tol(i)$ is the error value in iteration $i$, $Tol_{max}$ and $Tol_{min}$ are the
maximum and the minimum error values defined in the algorithm respectively, and
$iter_{max}$ is the maximum iteration number until which the tolerance minimization
procedure continues. Equation (6.15) provides larger error values at the beginning of
the design cycles and quite small error values towards the final design cycles. Hence,
when the maximum number of design cycles is reached, only the acceptable design
vectors remain in the harmony memory matrix; those that violate one or more design
constraints by more than the error tolerance are pushed out during the design
iterations.
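
The shrinking tolerance can be illustrated with a short Python sketch. It assumes that Eqn. (6.15) decays with the square root of the iteration number, and the default values `tol_max=0.01` and `tol_min=0.001` are placeholders chosen for illustration only; neither the function names nor these numbers come from the text.

```python
import math


def tolerance(i, iter_max, tol_max=0.01, tol_min=0.001):
    """Allowed constraint violation at iteration i (assumed square-root decay)."""
    i = min(i, iter_max)  # hold the minimum once the reduction phase is over
    return tol_max - (tol_max - tol_min) * math.sqrt(i / iter_max)


def is_acceptable(violations, i, iter_max):
    """Accept a design if every constraint violation is within the tolerance."""
    tol = tolerance(i, iter_max)
    return all(v <= tol for v in violations)
```

With these placeholder values, a design with a violation of 0.004 is accepted a quarter of the way through the reduction phase, where the tolerance is 0.0055, but rejected at the end, where the tolerance has dropped to 0.001.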

6.4.2 Standard Harmony Search with Penalty Function
(SHSPF)

This application of the harmony search method also follows the steps of the
standard harmony search technique. It differs from the standard one only in the
acceptance of the candidate solution vectors. All the design vectors selected randomly
are included in the harmony memory matrix regardless of whether they satisfy the
design constraints of the design problem or not. However, a penalty function
is constructed as shown in the following.

$$
W_p = W (1 + C)^{\varepsilon} \qquad (6.16)
$$

where $W_p$ is the objective function that contains the penalty and $W$ is the original
objective function, which is taken as the minimum weight of the steel space frame
as given in Eqn. (6.1). $C$ is the total constraint violation value, calculated as the
sum of the constraint function violations as given in Eqn. (6.17), and $\varepsilon$
is the penalty coefficient, taken as 2.



$$
C = \sum C_{s} + \sum C_{d} + \sum C_{id} + \sum C_{td} + \sum C_{cd} + \sum C_{cc} + \sum C_{bc} \qquad (6.17)
$$



where $C_{s}$, $C_{d}$, $C_{id}$, $C_{td}$, $C_{cd}$, $C_{cc}$ and $C_{bc}$ are the
constraint function violations for strength, deflection, inter-story drift, top story
drift, column-to-column depth and unit weight, and beam-to-column geometric constraint
functions given in inequalities (6.2), (6.3), (6.5), (6.6), (6.7), (6.8), (6.9),
(6.10) and (6.11) respectively. In general form, the constraint function violation is
calculated as:




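$$
C_i = \begin{cases} 0 & \text{if } g_i(x_j) \le 0 \\ g_i(x_j) & \text{if } g_i(x_j) > 0 \end{cases} \qquad i = 1, \dots, nc, \quad j = 1, \dots, ng \qquad (6.18)
$$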



where $g_i(x)$ is the $i$-th constraint function, $x$ is the vector of design variables,
$nc$ is the total number of constraint functions and $ng$ is the total number of member
groups (the total number of design variables) in the optimum design problem. It is
apparent from Eqn. (6.18) that feasible solutions are not subjected to any penalty,
so their objective function value is equal to the original objective function
value given in Eqn. (6.1). The harmony search method seeks solution vectors
in the design space that have smaller objective function values and stores these in
the harmony memory matrix during the design cycles. As a result, solution
vectors that have larger objective function values are eliminated from the harmony
memory matrix within the harmony search iterations. Towards the end of the design
cycles only solution vectors without any penalty remain in the
harmony memory matrix, and among these the one with the least weight represents the
optimum solution.
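
As a small illustration of the penalty described in this section, the Python sketch below evaluates a penalized weight of the form W(1 + C)^ε with ε = 2, where C is the sum of the positive constraint values. The function names and the sample numbers are hypothetical; the grouping of violations by constraint type in Eqn. (6.17) is collapsed into a single list here for brevity.

```python
def violation(g):
    """Violation of one normalized constraint: zero when g <= 0, else g itself."""
    return g if g > 0 else 0.0


def penalized_weight(weight, constraint_values, epsilon=2.0):
    """Penalized objective W * (1 + C)**epsilon, with C the summed violations."""
    total = sum(violation(g) for g in constraint_values)
    return weight * (1.0 + total) ** epsilon
```

For example, a frame weighing 100 kN whose constraint values are 0.1, -0.5 and 0.4 has C = 0.5 and a penalized weight of 225 kN, while a feasible frame keeps its original weight.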

6.4.3 Adaptive Harmony Search with Penalty Function (AHSPF) 

In the standard harmony search method there are two parameters, known as the harmony
memory considering rate (hmcr) and the pitch adjusting rate (par), that play an
important role in obtaining the optimum solution. These parameters are assigned
constant values that are arbitrarily chosen within the ranges recommended by
Geem [2-6] based on the observed efficiency of the technique in different problem
fields. It is observed through the application of the standard harmony search method
that the selection of these values is problem dependent. While a certain set of
values yields good performance of the technique in one type of design problem,
the same set may not give the same performance in another type of design
problem. Hence it is not possible to come up with a set of values that can be used
in every optimum design problem. In each problem a sensitivity analysis is required
to determine what set of values results in good performance.
The adaptive harmony search method eliminates the necessity of finding the best set
of parameter values [22, 23]. It adjusts the values of these parameters automatically
during the optimization process. The basic components of the adaptive harmony
search algorithm are outlined as follows.

6.4.3.1 Initialization of a Parameter Set 

The harmony search method uses four parameters, the values of which are required to be
selected by the user. This parameter set consists of a harmony memory size (hms),
a harmony memory considering rate (hmcr), a pitch adjusting rate (par) and a
maximum search number ($N_{max}$). Out of these four parameters, hmcr and par
are made dynamic parameters in the adaptive harmony search method, varying from
one solution vector to another. They are set to initial values of $hmcr^{(0)}$ and
$par^{(0)}$ for all the solution vectors in the initial harmony memory matrix. In the
standard harmony search algorithm these parameters are treated as static quantities, and
they are assigned suitable values chosen within their recommended ranges of
$hmcr \in [0.70, 0.95]$ and $par \in [0.20, 0.50]$ [2-6].

6.4.3.2 Initialization and Evaluation of Harmony Memory Matrix 

The harmony memory matrix, which contains candidate design vectors for the optimum
design problem under consideration, is established randomly as explained in section
3.1.1. The structural analysis of each solution is then performed with the
set of steel sections selected for the design variables, and the responses of each
candidate solution are obtained under the applied loads. The objective function values
of the feasible solutions that satisfy all problem constraints are directly calculated
from Eqn. (6.1). However, infeasible solutions that violate some of the problem
constraints are penalized using an external penalty function approach, and their
objective function values are calculated according to Eqn. (6.19).



$$
\phi = W \left[ 1 + \alpha \left( \sum_{i} g_i \right) \right] \qquad (6.19)
$$



In Eqn. (6.19), $\phi$ is the constrained objective function value, $g_i$ is the $i$-th
problem constraint value and $\alpha$ is the penalty coefficient used to tune the
intensity of penalization as a whole. This parameter is set to a static value of
$\alpha = 1$ in the numerical examples. Finally, the solutions evaluated are sorted in
the matrix in ascending order of objective function values, that is,
$\phi(\mathbf{I}^{1}) \le \phi(\mathbf{I}^{2}) \le \dots \le \phi(\mathbf{I}^{hms})$.
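
The following Python sketch illustrates the penalized objective of Eqn. (6.19) and the subsequent ranking of the harmony memory. It assumes that only violated constraints (positive values) contribute to the sum, which is why each value is clipped at zero; the function name, the design labels and the numbers are hypothetical.

```python
def constrained_objective(weight, constraint_values, alpha=1.0):
    """phi = W * (1 + alpha * sum of constraint violations)."""
    # Assumption: satisfied constraints (g <= 0) contribute nothing to the sum.
    total = sum(max(0.0, g) for g in constraint_values)
    return weight * (1.0 + alpha * total)


# Hypothetical designs: name -> (weight in kN, constraint values).
designs = {
    "A": (250.0, [0.0, -0.1]),
    "B": (240.0, [0.10, 0.0]),
    "C": (260.0, []),
}
# Rank the designs from best (lowest phi) to worst.
ranked = sorted(designs, key=lambda name: constrained_objective(*designs[name]))
```

In this example design B is the lightest, but its 10% violation raises its constrained objective to 264, so it is ranked behind the feasible designs A and C.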



6.4.3.3 Generating a New Harmony Vector 

In the adaptive algorithm a new set of values is sampled for hmcr and par pa- 
rameters each time prior to improvisation (generation) of a new harmony vector, 
which in fact forms the basis for the algorithm to gain adaptation to varying fea- 
tures of the design space. Accordingly, to generate a new harmony vector in the 
algorithm proposed, a two-step procedure is followed consisting of (i) sampling of 
control parameters, and (ii) improvisation of the design vector. 

6.4.3.3.1 Sampling of Control Parameters 

For each harmony vector to be generated during the search process, a new set
of values is first sampled for the hmcr and par control parameters by applying a
logistic normal distribution based variation to the average values of these parameters
within the harmony memory matrix, as formulated in Eqns. (6.20) and (6.21).



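$$
(hmcr)^{k} = \left( 1 + \frac{1 - (hmcr)'}{(hmcr)'} \, e^{-\gamma \cdot N(0,1)} \right)^{-1} \qquad (6.20)
$$

$$
(par)^{k} = \left( 1 + \frac{1 - (par)'}{(par)'} \, e^{-\gamma \cdot N(0,1)} \right)^{-1} \qquad (6.21)
$$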



In Eqns. (6.20) and (6.21), $(hmcr)^{k}$ and $(par)^{k}$ represent the sampled values of
the control parameters for a new harmony vector. The notation $N(0,1)$ designates
a normally distributed random number having expectation 0 and standard deviation 1.
The symbols $(hmcr)'$ and $(par)'$ denote the average values of the control
parameters within the harmony memory matrix, obtained by averaging the
corresponding values of all the solution vectors within the H matrix, that is,

$$
(hmcr)' = \frac{\sum_{i=1}^{hms} (hmcr)^{i}}{hms}, \qquad (par)' = \frac{\sum_{i=1}^{hms} (par)^{i}}{hms} \qquad (6.22)
$$

Finally, the factor $\gamma$ in Eqns. (6.20) and (6.21) refers to the learning rate of
the control parameters, which is recommended to be selected within a range of [0.25,
0.50]. In the numerical examples this parameter is set to 0.35.

In this implementation, for each new vector a probabilistic sampling of control
parameters is carried out around the average values of these parameters, $(hmcr)'$ and
$(par)'$, observed in the H matrix. Since the harmony memory
matrix at any instant incorporates the best (hms) solutions sampled thus far during
the search, the aim is to encourage forthcoming vectors to be sampled with values from
which the search process has benefited most in the past. The use of a logistic normal
distribution is well suited to this purpose because it not only guarantees
that the sampled values of control parameters lie within their possible range
of variation, i.e., [0, 1], but also permits small variations around
$(hmcr)'$ and $(par)'$ to occur more frequently than large ones. Accordingly, sampled
values of control parameters mostly fall within the close vicinity of the average
values, yet remote values are occasionally promoted to check alternating demands of the
search process.
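
The sampling rule can be sketched in Python as follows, assuming the logistic normal form in which the odds of the memory average are scaled by exp(-γ·N(0,1)). The function names `sample_rate` and `mean_rates` are illustrative; only the learning rate γ = 0.35 is taken from the text.

```python
import math
import random


def sample_rate(mean_rate, gamma=0.35, rng=random):
    """Draw a control-parameter value around `mean_rate` (0 < mean_rate < 1)."""
    odds = (1.0 - mean_rate) / mean_rate
    z = rng.gauss(0.0, 1.0)  # standard normal draw, N(0, 1)
    return 1.0 / (1.0 + odds * math.exp(-gamma * z))


def mean_rates(hmcr_values, par_values):
    """Average the rates stored with the harmonies currently in memory."""
    return (sum(hmcr_values) / len(hmcr_values),
            sum(par_values) / len(par_values))
```

A draw of z = 0 returns the memory average unchanged, and any other draw moves the value up or down while keeping it strictly between 0 and 1, which is the property the paragraph above relies on.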

6.4.3.3.2 Improvisation of the Design Vector 

Upon sampling of a new set of values for control parameters, the new harmony
vector $\mathbf{I}^{k} = [I_1^{k}, I_2^{k}, \dots, I_{ng}^{k}]$ is improvised in such a
way that each design variable is selected at random from either the harmony memory
matrix or the entire discrete set. Which one of these two sets is used for a variable
is determined probabilistically in conjunction with the harmony memory considering rate
$(hmcr)^{k}$ parameter of the solution. To implement the process, a uniform random
number $r_i$ is generated between 0 and 1 for each variable $I_i^{k}$. If $r_i$ is
smaller than or equal to $(hmcr)^{k}$, the variable is chosen from harmony memory, in
which case it is assigned any value from the $i$-th column of the H matrix, representing
the value set of the variable in the (hms) solutions of the matrix (Eqn. 6.12).
Otherwise (if $r_i > (hmcr)^{k}$), an arbitrary value is assigned to the variable from
the entire design set.



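$$
I_i^{k} = \begin{cases} I_i^{k} \in \{I_i^{1}, I_i^{2}, \dots, I_i^{hms}\} & \text{if } r_i \le (hmcr)^{k} \\ I_i^{k} \in \{1, \dots, N_s\} & \text{if } r_i > (hmcr)^{k} \end{cases} \qquad (6.23)
$$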




If a design variable attains its value from harmony memory, it is checked whether
this value should be pitch-adjusted or not. In pitch adjustment the value of a design
variable ($I_i^{k}$) is altered to its immediately upper or lower neighboring value,
obtained by adding ±1 to its current value. This process is also operated
probabilistically in conjunction with the pitch adjusting rate $(par)^{k}$ parameter of
the solution, Eqn. (6.21). If not activated by $(par)^{k}$, the value of the variable
does not change. Pitch adjustment prevents stagnation and improves the diversity of the
harmony memory, giving a greater chance of reaching the global optimum.

$$
I_i^{k\prime} = \begin{cases} I_i^{k} \pm 1 & \text{if } r_i \le (par)^{k} \\ I_i^{k} & \text{if } r_i > (par)^{k} \end{cases} \qquad (6.24)
$$


6.4.3.4 Update of the Harmony Memory and Adaptivity 

After generating the new harmony vector, its objective function value is calculated
as per Eqn. (6.19). If this value is better (lower) than that of the worst solution in
the harmony memory matrix, it is included in the matrix while the worst one is
discarded from the matrix. It follows that the solutions in the harmony memory
matrix represent the best (hms) design points located thus far during the search.
The harmony memory matrix is then sorted in ascending order of objective function
value. Whenever a new solution is added into the harmony memory matrix,
the $(hmcr)'$ and $(par)'$ parameters are recalculated using Eqn. (6.22). This way
the harmony memory matrix is updated with the most recent information required
for an efficient search, and the forthcoming solution vectors are guided to make
their own selection of control parameters mostly around these updated values. It
should be underlined that there are no single values of control parameters that lead
to the most efficient search of the algorithm throughout the process unless the
design domain is completely uniform. On the contrary, the optimum values of control
parameters tend to change over time depending on the region
of the design space in which the search is carried out. The update of the control
parameters within the harmony memory matrix enables the algorithm to keep up
with the varying needs of the search process as well. Hence the most advantageous
values of control parameters are adapted in the course of time automatically (i.e.,
by the algorithm itself), which plays the major role in the success of the adaptive
harmony search method discussed in the paper.

6.4.3.5 Termination 

The steps 4.3.4 and 4.3.5 are iterated in the same manner for each solution sampled
in the process, and the algorithm terminates when a predefined number of
solutions ($N_{max}$) is sampled.

6.4.4 Improved Harmony Search (IHS, Mahdavi)

The standard harmony search method uses fixed values for both the pitch adjustment rate
(par) and the arbitrary distance bandwidth (bw). Prior to the application of the
algorithm appropriate values are selected for these parameters, and they are kept the
same until the end of the iterations. For example, the value of the arbitrary distance
bandwidth (bw) is taken as ±1 in the standard harmony search method, although
some other value can also be used if preferred. It is stated in the work of
Mahdavi et al. [20] that the use of, for example, small fixed values for the pitch
adjustment rate (par) with large values of the arbitrary distance bandwidth (bw) can
cause a considerable increase in the total number of iterations required to reach the
optimum solution, resulting in poor performance of the algorithm.
In order to avoid such poor performance they have suggested
adaptive expressions for both of these parameters instead of fixed values. The values
of these parameters are adjusted dynamically by using Eqns. (6.25) and
(6.26) during the harmony search iterations. However, a fixed value is used for
the harmony memory considering rate (hmcr), which is kept the same until the
end of the iterations as in the standard harmony search method.



$$
par(i) = par_{min} + \frac{par_{max} - par_{min}}{Iter_{max}} \times i, \qquad i = 1, 2, \dots, Iter_{max} \qquad (6.25)
$$



where $par(i)$ is the pitch adjusting rate at iteration $i$, $par_{max}$ and $par_{min}$
are the maximum and the minimum values of the pitch adjusting rate, and $Iter_{max}$ is
the maximum iteration number.



$$
bw(i) = bw_{max} \exp\left( \frac{\ln\left( bw_{min} / bw_{max} \right)}{Iter_{max}} \times i \right) \qquad (6.26)
$$



where $bw(i)$ is the bandwidth in iteration $i$, and $bw_{max}$ and $bw_{min}$ are the
maximum and the minimum distance bandwidths. $par_{max}$, $par_{min}$, $bw_{max}$ and
$bw_{min}$ are specified prior to the application of the algorithm. They are taken as
0.5, 0.05, 1 and 5 respectively. In this study this technique is used with the adaptive
error strategy explained in section 4.1, not with the penalty function concept.
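
The two schedules of the improved harmony search can be sketched in Python as shown below. The linear form for par and the exponential form for bw follow Mahdavi's improved harmony search; the function names are illustrative, and the bandwidth limits are left as required arguments because the values passed in the usage note are chosen here only for illustration.

```python
import math


def par_at(i, iter_max, par_min=0.05, par_max=0.5):
    """Pitch adjusting rate that grows linearly from par_min to par_max."""
    return par_min + (par_max - par_min) * i / iter_max


def bw_at(i, iter_max, bw_min, bw_max):
    """Bandwidth that decays exponentially from bw_max to bw_min."""
    return bw_max * math.exp(math.log(bw_min / bw_max) * i / iter_max)
```

For instance, `bw_at(0, 100, bw_min=1.0, bw_max=5.0)` starts at the maximum bandwidth and `bw_at(100, 100, bw_min=1.0, bw_max=5.0)` ends at the minimum, while `par_at` moves from 0.05 to 0.5 over the same iterations.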

6.4.5 Global Best Harmony Search (GBHS, Mahdavi)

Mahdavi [5] also suggested another enhancement, which applies the concept
of the particle swarm optimizer [29, 30] to the improved harmony search method [20].
In a particle swarm optimizer, a swarm of particles flies through the search
space. Each particle represents a candidate solution to the optimization problem.
The position of a particle is influenced by its own best position and by the position
of the best particle in the swarm. The global best harmony search modifies the pitch
adjustment step of the harmony search method in a manner similar to the particle swarm
optimizer. It replaces the arbitrary distance bandwidth (bw) altogether and adds a
social dimension to the harmony search by selecting values from the best harmony for
the new harmony vector. The algorithm computes (par) from Eqn. (6.25) but does
not use (bw) at all. Instead, it employs the following equation to construct the new
harmony vector, not the one given in Eqn. (6.14).

$$
I_i^{new} = I_k^{best} \qquad (6.27)
$$

where best is the index of the best harmony in the harmony memory matrix and k
is a variable number randomly selected between 1 and ng, which is the total number
of design variables in the optimum design problem. In this study this technique
is used with the adaptive error strategy explained in section 4.1, not with the penalty
function concept.
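
A possible reading of the global best pitch adjustment is sketched below in Python. It assumes the memory is sorted so that `memory[0]` is the best harmony, and that a pitch-adjusted variable simply takes the value of a randomly chosen variable k of that best harmony; the function name and the argument `n_values` (the assumed size of the discrete set) are hypothetical.

```python
import random


def gbhs_improvise(memory, n_values, hmcr, par, rng=random):
    """Improvise one harmony; memory[0] is assumed to be the best stored harmony."""
    best = memory[0]
    n_vars = len(best)
    new = []
    for i in range(n_vars):
        if rng.random() <= hmcr:
            # Memory consideration, as in the standard method.
            value = rng.choice(memory)[i]
            if rng.random() <= par:
                # Global-best step: copy variable k of the best harmony.
                k = rng.randrange(n_vars)
                value = best[k]
        else:
            # Random selection from the whole discrete set {1, ..., n_values}.
            value = rng.randint(1, n_values)
        new.append(value)
    return new
```

Note that no bandwidth appears anywhere in the function, which mirrors the statement above that (bw) is not used in this variant.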

6.4.6 Improved Harmony Search (IHSC, Coelho) 

In another enhancement to the standard harmony search algorithm, Coelho et al. [24]
have suggested the adaptive expression given in Eqn. (6.28) for the pitch
adjustment rate (par). This version of the harmony search method is also called the
improved harmony search method. It has the same steps as the standard harmony
search algorithm, with the exception that it changes the value of the pitch
adjustment rate (par) in each iteration to the value computed from the following
equation.



$$
par(i) = par_{min} + (par_{max} - par_{min}) \times \frac{HMVal_{max}(i) - Mean(HMVal)}{HMVal_{max}(i) - HMVal_{min}(i)} \qquad (6.28)
$$



where $par(i)$ is the value of the pitch adjusting rate in iteration $i$, $par_{max}$ and
$par_{min}$ are the maximum and the minimum values of the pitch adjusting rate
respectively, and $HMVal_{max}$, $HMVal_{min}$ and $Mean(HMVal)$ are the maximum, the
minimum and the mean values of the objective function in the harmony memory matrix
respectively. The values of $par_{max}$ and $par_{min}$ are taken as 0.99 and 0.01
respectively in this study. This technique is used with the adaptive error strategy
explained in section 4.1, not with the penalty function concept.
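
The Coelho-type rate can be computed from the objective values currently stored in the harmony memory, as in the Python sketch below. The function name is illustrative, and the fallback to the minimum rate when all stored values coincide is an assumption added here to avoid a division by zero; the text does not discuss that case.

```python
def coelho_par(costs, par_min=0.01, par_max=0.99):
    """Pitch adjusting rate from the spread of objective values in memory."""
    worst, best = max(costs), min(costs)
    if worst == best:
        # Assumption: fall back to par_min when the memory has no spread.
        return par_min
    mean = sum(costs) / len(costs)
    return par_min + (par_max - par_min) * (worst - mean) / (worst - best)
```

For stored values of 10, 20 and 30 the mean lies midway between the extremes, so the rate is halfway between its limits, that is 0.5.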

6.4.7 Dynamic Harmony Search (DHS) 

Dynamic harmony search is suggested in this study. This version of the harmony
search method has the same steps as the standard harmony search algorithm, with
the exception that, instead of using fixed values for the harmony memory
considering rate (hmcr) and the pitch adjusting rate (par), their values are
calculated by means of adaptive expressions. The value of (hmcr) is computed from
Eqn. (6.20) and (par) is calculated from Eqn. (6.28). In other words, the dynamic
harmony search method is a mixture of the adaptive harmony search and Coelho's improved
harmony search algorithms. The adaptive error strategy explained in section 4.1,
rather than the penalty function concept, is employed in this technique as well.



6.5 Design Examples 

Seven different structural optimization programs are coded each of which is based 
on one of the above explained versions of the harmony search algorithms. Three 
steel space frames are designed using these seven different versions of harmony 
search algorithms and the optimum solutions determined are compared with each 
other in order to evaluate the performance of each version. 

6.5.1 Five-Story, Two-Bay Regular Steel Space Frame 

The five-story, two-bay steel frame, whose plan and 3D views are shown in
Figures 6.3 and 6.4, is a regular steel frame with 54 joints and 105 members that
are grouped into 11 independent design variables. The frame is subjected to gravity
loads as well as lateral loads that are computed as per ASCE 7-05 [28]. The design
dead and live loads are taken as 2.88 kN/m² and 2.39 kN/m² respectively. The
ground snow load is considered to be 0.755 kN/m² and the basic wind speed is
105 mph (65 m/s). The un-factored distributed gravity loads on the beams of the
roof and floors are tabulated in Table 6.2. The following load combinations are
considered in the design of the frame according to the code specification:
1.2D+1.6L+0.5S, 1.2D+0.5L+1.6S and 1.2D+1.6W+0.5L+0.5S, where D is the dead
load, L represents the live load, S is the snow load and W is the wind load.
The drift limits of this frame are defined as 1.33 cm for inter-story drift and
6.67 cm for top story drift. The maximum deflection of beam members is restricted to
1.67 cm.
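
To make the load combinations concrete, the Python sketch below evaluates the three combinations for scalar load values. It is only an arithmetic illustration: the function name is hypothetical, and treating each load as a single number in kN/m ignores that the wind load acts laterally and enters the analysis as a separate load case.

```python
def factored_loads(D, L=0.0, S=0.0, W=0.0):
    """Evaluate the three load combinations for scalar load values."""
    return {
        "1.2D+1.6L+0.5S": 1.2 * D + 1.6 * L + 0.5 * S,
        "1.2D+0.5L+1.6S": 1.2 * D + 0.5 * L + 1.6 * S,
        "1.2D+1.6W+0.5L+0.5S": 1.2 * D + 1.6 * W + 0.5 * L + 0.5 * S,
    }


# Floor beam gravity loads from Table 6.2 (kN/m): dead 4.78, live 5.76.
floor = factored_loads(D=4.78, L=5.76)
# Roof beam gravity loads from Table 6.2 (kN/m): dead 4.78, snow 1.508.
roof = factored_loads(D=4.78, S=1.508)
```

For the floor beams the first combination gives 1.2 × 4.78 + 1.6 × 5.76 = 14.952 kN/m, and for the roof beams the second combination gives 1.2 × 4.78 + 1.6 × 1.508 = 8.1488 kN/m.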




Table 6.2 Beam gravity loading of the five-story, two bay steel frame

               Uniformly distributed load (kN/m)
Beam Type      Dead Load    Live Load    Snow Load
Roof Beams     4.78         -            1.508
Floor Beams    4.78         5.76         -



Fig. 6.3 Plan view of five-story, two bay steel frame

Fig. 6.4 3D View of the five-story, two bay steel frame



The optimum design problem of the five-story, two-bay steel frame is solved by using
the seven different versions of the harmony search algorithm. In these algorithms the
following harmony search parameters are used: harmony memory size (hms) = 20,
pitch adjusting rate (par) = 0.3, harmony memory considering rate (hmcr) = 0.9
and maximum iteration number = 50000. The optimum designs obtained from
each of these algorithms are shown in Table 6.3. It is apparent from the table that
the lightest weight is 261.128 kN, which is obtained by the adaptive harmony
search algorithm, and the second lightest design is 261.360 kN, attained by the
dynamic harmony search method suggested in this study. The design histories of
these algorithms for the best solutions are plotted in Fig. 6.5. It is apparent from
the figure that the dynamic harmony search and adaptive harmony search algorithms
show better performance than the others. The minimum weights determined by the
dynamic harmony search and adaptive harmony search algorithms are about 12.3% less
than that of the heaviest frame.

Table 6.3 Design results of the five-story, two bay steel frame

Member Group            Type      SHSAES       SHSPF        AHSPF
1                       Beam      W530X66      W360X39      W410X46.1
2                       Beam      W310X38.7    W310X38.7    W310X38.7
3                       Column    W200X35.9    W360X39      W250X32.7
4                       Column    W200X35.9    W200X46.1    W200X46.1
5                       Column    W360X44      W610X101     W460X52
6                       Column    W310X38.7    W530X66      W410X53
7                       Column    W360X72      W410X60      W360X64
8                       Column    W610X92      W1000X222    W920X201
9                       Column    W410X53      W610X92      W410X53
10                      Column    W360X72      W410X60      W460X74
11                      Column    W760X147     W1100X390    W1000X258
Max. Strength Ratio               0.975        0.986        0.936
Top Drift (cm)                    4.83         5.264        4.81
Inter Story Drift (cm)            1.333        1.329        1.331
Maximum Iteration                 50000        50000        50000
Weight (kN)                       278.196      268.172      261.128

Member Group            Type      IHS          GBHS         IHSC         DHS
                                  (Mahdavi)    (Mahdavi)    (Coelho)     (Present Study)
1                       Beam      W530X66      W530X74      W530X74      W460X52
2                       Beam      W310X38.7    W360X44      W360X44      W250X38.5
3                       Column    W200X35.9    W200X41.7    W200X41.7    W310X38.7
4                       Column    W200X35.9    W200X41.7    W200X41.7    W200X35.9
5                       Column    W360X44      W360X44      W360X44      W460X52
6                       Column    W310X38.7    W360X44      W360X44      W360X51
7                       Column    W360X64      W360X44      W310X44.5    W250X73
8                       Column    W610X82      W610X92      W610X113     W610X101
9                       Column    W410X53      W360X51      W360X51      W360X51
10                      Column    W360X64      W410X60      W410X60      W460X74
11                      Column    W840X193     W840X193     W760X134     W840X193
Max. Strength Ratio               0.989        0.986        0.979        0.975
Top Drift (cm)                    4.763        4.579        4.73         5.074
Inter Story Drift (cm)            1.331        1.325        1.333        1.33
Maximum Iteration                 50000        50000        50000        50000
Weight (kN)                       275.46       297.928      297.424      261.36

Fig. 6.5 Design histories of the five-story, two bay steel frame



6.5.2 Ten-Story, Four-Bay Steel Space Frame 



The three-dimensional view, side view and plan of the ten-story, four-bay steel
frame shown in Figures 6.6 and 6.7 are taken from [19, 22]. This frame has 220
joints and 568 members, which are collected in 25 member groups that are the
independent design variables, as shown in Figure 6.6. Inner roof beams, outer roof
beams, inner floor beams and outer floor beams are subjected to 14.72 kN/m,
7.36 kN/m, 21.43 kN/m and 10.72 kN/m uniformly distributed gravity loads
respectively. Lateral forces acting at the level of each story of the steel space
frame are tabulated in Table 6.4. Drift ratio limits are defined as h/400 for
inter-story drift, where h is the story height, and H/400 for top story drift, where
H is the total height of the structure.

Fig. 6.6 3-D view of ten-story, four-bay steel space frame

Fig. 6.7 Plan and side view of ten-story, four-bay steel space frame

Table 6.4 Lateral loads acting at the level of each story of ten-story, four-bay steel space
frame

Story      Windward                Leeward
Number     (lb/ft)     (kN/m)      (lb/ft)     (kN/m)
1          12.51       0.1825      127.38      1.8585
2          28.68       0.4184      127.38      1.8585
3          44.68       0.6519      127.38      1.8585
4          156.86      2.2886      127.38      1.8585
5          167.19      2.4393      127.38      1.8585
6          176.13      2.5698      127.38      1.8585
7          184.06      2.6854      127.38      1.8585
8          191.21      2.7897      127.38      1.8585
9          197.76      2.8853      127.38      1.8585
10         101.9       1.5743      127.38      1.8585

The optimum design problem of this frame is solved under the design constraints
described in section 2 by using the seven different versions of the harmony search
algorithm described above. In these algorithms the following harmony search parameters
are used: harmony memory size (hms) = 30, pitch adjusting rate (par) = 0.3, harmony
memory considering rate (hmcr) = 0.9, and maximum iteration number = 80000.
The optimum designs obtained by each of these algorithms are shown in
Table 6.5. It is clear from the table that the lightest weight is 1699.88 kN, which is
obtained by the dynamic harmony search method, and the second lightest weight of
the frame is 1714.46 kN, attained by the adaptive harmony search algorithm. The
design histories of these algorithms for the best solutions are plotted in Fig. 6.8. It
is apparent from the figure that the dynamic harmony search algorithm shows
steady convergence and outperforms the others. It is noticed that the minimum weight
determined by the dynamic harmony search is 7.3% less than the heaviest frame.

Table 6.5 Design results for ten-story, four-bay steel space frame

Member Group            Type      SHSAES       SHSPF        AHSPF
1                       Column    W310X28.3    W150X22.5    W310X28.3
2                       Column    W310X28.3    W200X86      W310X28.3
3                       Column    W360X32.9    W760X173     W360X39
4                       Beam      W410X46.1    W250X25.3    W410X46.1
5                       Beam      W410X46.1    W410X38.8    W460X52
6                       Column    W410X46.1    W410X114     W410X38.8
7                       Column    W460X52      W760X196     W410X38.8
8                       Column    W530X66      W840X176     W610X82
9                       Beam      W310X23.8    W360X110     W310X23.8
10                      Beam      W460X60      W690X152     W460X52
11                      Column    W250X67      W410X100     W200X35.9
12                      Column    W250X73      W460X128     W250X80
13                      Column    W310X44.5    W690X170     W360X44
14                      Beam      W310X97      W310X60      W460X113
15                      Beam      W460X128     W530X85      W460X113
16                      Column    W530X85      W310X97      W530X85
17                      Column    W310X107     W310X117     W460X128
18                      Column    W530X150     W530X85      W610X217
19                      Beam      W690X170     W250X32.7    W760X173
20                      Beam      W310X117     W410X46.1    W530X150
21                      Column    W760X196     W310X97      W690X217
22                      Column    W840X176     W200X59      W760X173
23                      Column    W150X29.8    W410X60      W150X24
24                      Beam      W250X73      W250X32.7    W250X49.1
25                      Beam      W410X132     W310X38.7    W360X134
Max. Strength Ratio               0.99         1            0.995
Top Drift (cm)                    8.158        7.639        7.695
Inter Story Drift (cm)            0.914        0.914        0.914
Maximum Iteration                 80000        80000        80000
Weight (kN)                       1756.56      1800.28      1714.46

Table 6.5 (continued)

Member Group            Type      IHS          GBHS         IHSC          DHS
                                  (Mahdavi)    (Mahdavi)    (Coelho)      (Present Study)
1                       Column    W310X28.3    W310X23.8    W200X19.3     W310X23.8
2                       Column    W310X28.3    W360X32.9    W530X85       W310X28.3
3                       Column    W360X39      W460X52      W410X132      W410X46.1
4                       Beam      W410X38.8    W310X38.7    W310X23.8     W410X46.1
5                       Beam      W530X66      W360X64      W460X60       W410X46.1
6                       Column    W530X66      W530X150     W310X107      W410X46.1
7                       Column    W410X46.1    W310X32.7    W760X196      W530X66
8                       Column    W410X38.8    W690X125     W760X257      W530X66
9                       Beam      W310X23.8    W310X23.8    W530X109      W310X23.8
10                      Beam      W460X60      W360X32.9    W360X134      W460X52
11                      Column    W250X67      W250X73      W310X97       W250X73
12                      Column    W250X73      W250X58      W250X101      W250X67
13                      Column    W360X44      W360X51      W690X170      W360X44
14                      Beam      W310X97      W250X80      W410X60       W310X97
15                      Beam      W410X100     W610X113     W530X85       W460X113
16                      Column    W530X85      W530X85      W250X73       W610X92
17                      Column    W310X107     W610X101     W250X101      W360X134
18                      Column    W530X150     W690X140     W530X92       W460X128
19                      Beam      W690X170     W610X174     W410X38.8     W840X176
20                      Beam      W360X162     W610X155     W410X46.1     W360X134
21                      Column    W760X196     W920X201     W250X73       W760X196
22                      Column    W690X170     W1000X296    W200X59       W840X176
23                      Column    W150X29.8    W150X24      W410X60       W150X24
24                      Beam      W410X53      W250X49.1    W310X28.26    W250X49.1
25                      Beam      W310X129     W840X193     W310X28.3     W530X150
Max. Strength Ratio               0.988        0.961        0.965         0.976
Top Drift (cm)                    7.774        8.077        7.589         8.069
Inter Story Drift (cm)            0.915        0.914        0.913         0.912
Maximum Iteration                 80000        80000        80000         80000
Weight (kN)                       1739.47      1842.95      1773.51       1699.88
[Figure: design history curves for SHS (Penalty Function), SHS (Feasible Solution),
IHS (Mahdavi), IHS (Coelho), GBHS (Mahdavi) and Dynamic (IHS Coelho + Adaptive);
horizontal axis: Iteration, 10000 to 80000]

Fig. 6.8 Design histories of ten-story, four-bay steel space frame



6.5.3 Twenty-Story, 1860-Member, Steel Space Frame 



The plan and the three-dimensional view of the twenty-story, 1860-member steel space
frame are illustrated in Figures 6.9 and 6.10, respectively. The frame has 820 joints
and 1860 members, which are grouped into 86 independent design variables. The member
grouping is given in Figure 6.9. The frame is subjected to gravity loads as well as
lateral loads that are computed according to ASCE 7-05 [28]. The design dead and
live loads are taken as 2.88 kN/m² and 2.39 kN/m², respectively. The basic wind
speed is taken as 85 mph (38 m/s). The following load combinations are considered
in the design of the frame according to the code specification [25]:
1.2D + 1.3WZ + 0.5L + 0.5S and 1.2D + 1.3WX + 0.5L + 0.5S, where D is the dead load,
L is the live load, S is the snow load, and WX and WZ are the wind loads in
the global X and Z directions, respectively. Drift limits are defined as h/400
for inter-story drift, where h is the story height, and H/400 for top-story drift,
where H is the height of the structure.
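As a quick illustration of how the two load combinations and the drift limits above could be evaluated, the following Python sketch may help. The load effects and the 3 m story height are hypothetical placeholders chosen only for the example; they are not data of this frame, and equal story heights are assumed.

```python
def factored_load(dead, live, snow, wind):
    """Combination 1.2D + 1.3W + 0.5L + 0.5S for one wind direction."""
    return 1.2 * dead + 1.3 * wind + 0.5 * live + 0.5 * snow


def drift_limits(story_height_cm, n_stories, ratio=400.0):
    """Return (inter-story limit h/400, top-story limit H/400) in cm.

    Assumes equal story heights, so that H = n_stories * h.
    """
    inter_story = story_height_cm / ratio
    top_story = n_stories * story_height_cm / ratio
    return inter_story, top_story


# Hypothetical load effects on one member (same units for all load cases).
combo_x = factored_load(dead=10.0, live=4.0, snow=2.0, wind=6.0)  # wind along X
combo_z = factored_load(dead=10.0, live=4.0, snow=2.0, wind=3.0)  # wind along Z
print(combo_x, combo_z)

# Hypothetical 3 m stories in a twenty-story frame.
print(drift_limits(300.0, 20))
```

In a design loop, the governing value of the two combinations would be checked against member strength, and the computed drifts against the two limits.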
Fig. 6.9 Plan view of twenty-story, 1860-member steel space frame
Fig. 6.9 (continued)
Fig. 6.10 3-D view of twenty-story, 1860-member steel space frame
This 1860-member frame is also designed seven times, once with each version of the
harmony search algorithm. In these runs the harmony search parameters are selected
as follows: harmony memory size (hms) = 50, pitch adjusting rate (par) = 0.3, harmony
memory considering rate (hmcr) = 0.9, and maximum iteration number = 80000. The
optimum designs obtained by each of these algorithms are given in Table 6.6. It is
apparent from the table that the best design is obtained by the dynamic harmony
search method, with a minimum weight of 4716.756 kN. The second best design is
obtained by the adaptive harmony search algorithm (AHS), with a weight of
4932.012 kN. The difference between these two results is only about 4%. In contrast,
the minimum weights of the best designs obtained by the other harmony search
algorithms lie between about 6200 kN and 6430 kN. Therefore, it can be stated that
the dynamic and adaptive harmony search methods perform better than the other
versions of the harmony search method. The design histories of each harmony search
method are shown in Fig. 6.11. The figure shows that the dynamic and adaptive
harmony search methods outperform the other versions of the harmony search
algorithm from the beginning of the design cycles.

Table 6.6 Design results for twenty-story, 1860-member steel space frame

Beam Type               Member Group    SHSAES      SHSPF       AHSPF
Outer                   1               W410X67     W410X53     W460X60
Interior                2               W460X52     W460X68     W460X60

Columns Story           Member Group    SHSAES      SHSPF       AHSPF
20,19                   3               W410X85     W250X73     W310X38.7
19,18                   6               W410X132    W690X125    W410X38.8
16,15                   9               W410X60     W200X41.7   W200X22.5
14,13                   15              W920X223    W460X68     W200X26.6
12,11                   21              W920X271    W610X174    W360X39
10,9                    29              W1000X314   W840X176    W460X60
8,7                     37              W150X29.8   W200X22.5   W250X25.3
6,5                     48              W410X46.1   W460X128    W310X28.3
4,3                     59              W1000X272   W840X176    W360X51
4,3                     72              W310X44.5   W200X86     W250X32.7
2,1                     73              W1000X272   W840X251    W690X125

Top Story Drift (cm)                    9.013       8.777       9.809
Inter-Story Drift (cm)                  0.742       0.738       0.75
Max. Strength Ratio                     0.84        0.837       1
Weight (kN)                             6319.554    6204.204    4932.012
Table 6.6 (continued)

Beam Type               Member Group    IHS         GBHS        IHS         DHS
                                        (Mahdavi)   (Mahdavi)   (Coelho)    (Present Study)
Outer                   1               W410X60     W410X60     W360X51     W410X53
Interior                2               W460X60     W410X60     W460X60     W460X60

Columns Story           Member Group    IHS         GBHS        IHS         DHS
                                        (Mahdavi)   (Mahdavi)   (Coelho)    (Present Study)
20,19                   3               W200X46.1   W250X80     W360X51     W410X53
19,18                   6               W310X52     W460X82     W460X128    W410X53
16,15                   9               W200X41.7   W200X22.5   W150X37.1   W200X22.5
14,13                   15              W920X201    W760X161    W410X60     W310X28.3
12,11                   21              W920X201    W920X345    W690X265    W310X32.7
10,9                    29              W1000X258   W1100X499   W1000X412   W360X39
8,7                     37              W250X73     W610X82     W360X39     W310X28.3
6,5                     48              W840X193    W760X147    W760X196    W310X38.7
4,3                     59              W1000X222   W1000X249   W760X257    W610X101
2,1                     73              W1100X343   W1000X249   W1100X433   W610X101

Top Story Drift (cm)                    8.954       8.76        9.576       10.02
Inter-Story Drift (cm)                  0.75        0.744       0.733       0.748
Max. Strength Ratio                     0.795       0.69        0.892       1
Weight (kN)                             6259.736    6431.886    6337.728    4716.756
[Figure: design history curves for SHS (Feasible Solution), SHS (Penalty Function),
IHS (Mahdavi), IHS (Coelho), GBHS (Mahdavi) and Dynamic (IHS Coelho + Adaptive);
horizontal axis: Iteration, 10000 to 80000]

Fig. 6.11 Design histories of twenty-story, 1860-member steel space frame

Table 6.7 Performance evaluation of seven different versions of the harmony search
algorithms in the design examples

Design Examples       SHSAES   SHSPF   AHSPF   IHS         GBHS        IHSC       DHS
                                               (Mahdavi)   (Mahdavi)   (Coelho)   (Present Study)
Five-story frame      5        …       …       4           7           6          2
Ten-story frame       4        …       …       3           7           5          1
Twenty-story frame    …        …       …       …           …           …          …

6.6 Conclusions 



Seven structural optimization algorithms are developed, each based on one of seven
recently developed versions of the harmony search algorithm. Three steel space frames
are designed by these algorithms to evaluate their performance in finding the optimum
solutions. All of these alternative harmony search algorithms are shown to be reliable,
robust and effective. However, two of them, the adaptive harmony search method and
the dynamic harmony search method, perform better than the other versions. In
particular, in the third design example, which has a relatively large number of design
variables and a bigger design domain, the dynamic harmony search method finds an
optimum weight that is 25.36% less than the one determined by the standard harmony
search algorithm. The performance of all these techniques in the design of the three
steel space frames considered is summarized in Table 6.7.
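The ranking idea behind Table 6.7 and the quoted 25.36% saving can be checked against the final weights listed in Tables 6.5 and 6.6. The sketch below assumes that an algorithm's rank is simply its position when the designs are sorted by weight, and that the "standard harmony search" in the comparison is the SHSAES variant; both are assumptions made for illustration, as are the function names.

```python
# Final weights (kN) as listed in Tables 6.5 and 6.6.
ten_story = {
    "SHSAES": 1756.56, "SHSPF": 1800.28, "AHSPF": 1714.46,
    "IHS": 1739.47, "GBHS": 1842.95, "IHSC": 1773.51, "DHS": 1699.88,
}
twenty_story = {
    "SHSAES": 6319.554, "SHSPF": 6204.204, "AHSPF": 4932.012,
    "IHS": 6259.736, "GBHS": 6431.886, "IHSC": 6337.728, "DHS": 4716.756,
}


def rank_by_weight(weights):
    """Map each algorithm to its rank (1 = lightest design)."""
    ordered = sorted(weights, key=weights.get)
    return {name: position for position, name in enumerate(ordered, start=1)}


def saving_percent(weight, reference):
    """Weight reduction of `weight` relative to `reference`, in percent."""
    return 100.0 * (reference - weight) / reference


print(rank_by_weight(ten_story))
print(rank_by_weight(twenty_story))
print(round(saving_percent(twenty_story["DHS"], twenty_story["SHSAES"]), 2))
```

Under these assumptions the ten-story ranks agree with the entries of Table 6.7 that are legible here.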
Chapter 7

Waveform Optimization for Integrated Radar
and Communication Systems Using
Meta-Heuristic Algorithms



Momin Jamil and Hans-Jürgen Zepernick



Abstract. Integration of multiple functions such as navigation and radar tasks with 
communication applications has attracted substantial interest in recent years. In this 
chapter, we therefore focus on the waveform optimization for such integrated sys- 
tems based on Oppermann sequences. These sequences are defined by a number 
of parameters that can be chosen to design sequence sets for a wide range of
performance characteristics. It will be shown that meta-heuristic algorithms are
well-suited to find the optimal parameters for these sequences. The motivation behind
the use of biologically inspired heuristic and/or meta-heuristic algorithms is their
ability to solve large, complex, and dynamic problems.

7.1 Introduction 

In recent years, integration of multiple functions such as navigation and radar tasks
with communication applications has sparked a number of research initiatives. This
includes research on future signals for hybrid receivers for Global Navigation
Satellite Systems (GNSS)/communication and other tasks. The many benefits of
multi-functionality range from reducing costs and probability of intercept to offering
tolerable co-site interference. While navigation and radar applications require
waveform designs that offer excellent autocorrelation characteristics, communication
applications target sets of waveforms with minimum crosscorrelation among the
sequences in the set. In the former case, typically only a single sequence is needed,
while in the latter case many sequences are required to support access of multiple
users to the common transmission medium. As excellent autocorrelation properties come
at the expense of crosscorrelation characteristics and vice versa, a related waveform
optimization problem has to be posed and solved taking into account these conflicting
requirements.

As far as the integration of radar and communication functionalities is concerned,
the Office of Naval Research launched the Advanced Multifunction Radio Frequency
Concept (AMRFC) program in 1996 [18, 53]. This major program was motivated
by the lack of integration of radar, communications, and electronic warfare
functions which resulted in a significant increase of the number of topside anten- 
nas. Furthermore, it was realized that the lack of integration may also cause severe 
problems with antenna blockage and difficulties with own-ship electromagnetic in- 
terference. Also, a large number of antennas puts stress on maintenance resources. 
The concepts developed within the AMRFC program are centered around suitable
broadband RF apertures that can cope with simultaneous operation of multiple
functions and as such focus on the rather expensive radio frequency (RF) front-end.
A different approach on the basis of linear frequency modulated (LFM) waveforms,
also known as chirps, has been proposed in [39]. In order to enhance the
orthogonality among the signals and to support distinct separation of the different functions,
it uses up-chirps for the communications component and down-chirps for the radar 
functionality of the integrated system. In this way, the suggested chirp signals al- 
low for the radar and communication data to be simultaneously transmitted and 
received using some standard antenna array. Noting the inherent connection of the 
chirp-based integration concept to spread spectrum techniques, the work of [59, 60]
investigated integrated radar and communication systems with the help of bipolar
pseudo noise (PN) sequences, namely m-sequences [14, 63]. However, one of the
severe drawbacks of m-sequences with respect to radar applications is their poor 
Doppler tolerance f3T| and related problems of detecting multiple targets. These 
and related designs such as polyphase Barker sequences are optimized only with 
respect to the zero Doppler cut of the ambiguity function but produce much higher 
interference levels in the presence of Doppler shifted waveforms. As for the applica- 
tion to communications, large sets of m-sequences that would be needed to support 
multiple-access of many users have typically rather poor crosscorrelation
properties [63]. As a consequence, they are generally only used as components of more
complex designs such as Gold sequences. On the other hand, the large advances in 
modern integrated circuit technologies would facilitate an efficient implementation 
of more advanced sequence designs such as complex-valued sequences. Clearly, ef- 
ficient optimization methods are needed to find suitable waveform and sequence 
designs for different applications. 

Over the last few decades, researchers around the world have developed a vast 
number of algorithms to solve different optimization problems. Many of these al- 
gorithms are based on numerical linear and non-linear programming methods. As 
a result, the related algorithms require substantial gradient information and try to 
improve the solution in the proximity of an initial starting point. As a consequence, 
these methods provide useful strategies to find the global optimum for rather ideal
and simple models. However, if the objective function and constraints have multiple
or sharp peaks, these methods tend to become unstable. Most real-world problems
turn out to be too complex for such numerical optimization methods, which tend to
fail or are even unable to solve them. There also exist several direct search
approaches that use no gradient information, such as the Hooke and Jeeves method [17],
the Nelder-Mead simplex method [34], the Rosenbrock method [50], and the Powell
method [47]. Common to these methods is that they take the basic approach of heading
downhill from an arbitrary starting point but differ in deciding in which direction
and how far to move. Accordingly, the final
outcome depends somewhat on the initial guess of the starting point. This would 
not be a major shortcoming if the parameter space is well behaved, i.e. if it con- 
tains a single, well-defined minimum. However, if the parameter space contains 
many local minima, as may be the case in waveform optimization, it can be more 
difficult for such traditional approaches to find the global minimum. In contrast to 
population based algorithms, these direct searches cannot explore the search space 
effectively in different directions simultaneously. Successive improvements can be 
made to speed up the downhill movement of the algorithms, but this does not improve
an algorithm's ability to find the global minimum instead of converging to a
local minimum.
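The dependence of a downhill direct search on its starting point can be illustrated with a small Python sketch. The search below is a simple one-dimensional compass search written in the spirit of, but not identical to, the direct search methods named above; the test function, step size and tolerance are hypothetical choices made only for this example.

```python
import math


def objective(x):
    """Hypothetical one-dimensional test function with several local minima."""
    return 0.1 * x * x + math.sin(3.0 * x)


def compass_search(f, x0, step=0.5, tol=1e-6, max_iter=10_000):
    """Derivative-free downhill search: try a step to the left and to the
    right, move if the function value improves, otherwise halve the step."""
    x, fx = x0, f(x0)
    for _ in range(max_iter):
        if step < tol:
            break
        for candidate in (x - step, x + step):
            fc = f(candidate)
            if fc < fx:
                x, fx = candidate, fc
                break
        else:
            step *= 0.5  # no improvement in either direction
    return x, fx


for start in (-4.0, -1.0, 1.0, 4.0):
    x_min, f_min = compass_search(objective, start)
    print(f"start {start:5.1f} -> x = {x_min:8.4f}, f = {f_min:8.4f}")
```

Each start settles in the basin it begins in, so only some of the runs reach the lowest minimum; making the search faster would not change this.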

The drawbacks of numerical methods motivated researchers to adopt ideas from 
nature and to translate them to solve problems in engineering sciences. This has led 
to the inception of many biologically inspired heuristic or meta-heuristic algorithms 
to solve challenging optimization problems. The word "meta" means beyond or 
higher and "heuristic" means to find or to discover by trial and error. These methods 
have proven to be efficient in handling computationally complex problems. They 
aim at defining effective general purpose methods to explore the solution space and 
avoid tailoring them to a specific problem. Due to their general purpose nature, they 
can be applied to a wide range of problems. Meta-heuristic algorithms are also
referred to as black-box algorithms as they exploit limited knowledge about the
problem to be solved. As no gradient or Hessian matrix information is required for
their operation, they are also referred to as derivative-free or zero-order
algorithms [1].
The term zero-order implies that only the function values are used to establish the 
search vector. Moreover, the function to be optimized does not necessarily have to be 
continuous or differentiable and may also be accompanied by a set of constraints. 
The choice of method for solving a particular problem depends primarily on the
type and characteristics of the problem at hand. It must be stressed that the goal of
a particular method is to find the "best" solution of some sort to a problem
rather than necessarily the optimal solution. In this context, the term "best" refers to
an acceptable or satisfactory solution to the problem. This could be the absolute best
solution from a set of candidate solutions or may be any of the candidate solutions.
The requirements and characteristics of the problem determine if the overall best
solution can be found [10, 54].

Nature has an evolution span of millions or even billions of years. In all these
years, it has mastered the art of finding a perfect solution to almost all the
problems it has been confronted with. As mentioned above, the development of
nature-inspired optimization algorithms has been an area of active research in recent
years and has resulted in many approaches such as genetic algorithms (GA), ant colony
optimization (ACO), bee algorithms (BA), artificial bee colony (ABC) algorithms,
particle swarm optimization (PSO), simulated annealing (SA), harmony search (HS),
firefly algorithms (FA), and artificial immune systems (AIS). The interested reader
is referred to [3, 10, 15, 31, 51, 56, 62] and the references therein for more details
and discussions on these topics.

Given the vast number of available optimization methods, their application in
waveform design ranges from simple searches through more sophisticated and
computationally demanding realizations to the use of meta-heuristic algorithms. A
simple computer search has been used in [45] to obtain sets of sequences with
various combinations of sequence parameters. In [58], the optimization of orthogonal
polyphase spreading sequences for wireless data applications is reported. It uses a
built-in standard 'fmin' function provided in the numerical computing environment
MATLAB. In particular, the related functions support multidimensional
unconstrained nonlinear minimization including the Nelder-Mead direct search method.
As the utilized cost functions in terms of average mean-square autocorrelation and
crosscorrelation are very irregular and may have several local minima, the authors
report that the optimization outcome depends on the starting point and that the
search converges to different local minima accordingly. A similar optimization problem
for complex-valued spreading sequences has been investigated in [9] using a global
optimization method based on a modified bridging method. In order to solve the
related complex optimization problem having a non-linear cost function and a
non-linear constraint, a bridged function is used in the search for the global minimum
such that the algorithm does not get stuck in a local minimum. Given that cost
functions in waveform optimization are often highly irregular with many local minima
or are even discontinuous, evolutionary algorithms have gained increased attention in
the design of waveforms for communication and radar applications. An
evolutionary approach for designing complex spreading codes for direct sequence
code-division multiple-access (DS-CDMA) systems has been proposed in [42, 43].
In particular, a multi-objective evolutionary approach is used to search for solutions
that simultaneously satisfy objectives posed on autocorrelation and crosscorrelation
properties. This approach turned out to be beneficial in the communications field
for designing a large number of spreading sequence sets with a wide range of
correlation properties. In [7], genetic algorithms have been used to design PN sequence
families with bounded correlation properties. It is claimed that this approach can
produce sequences of any length and superior performance compared to the
well-known Gold sequences. A number of recent works have also reported on the use
of evolutionary algorithms in the field of radar applications. In [2], an evolutionary
algorithm is applied to determine a suite of optimal waveforms to simultaneously
perform different surveillance missions such as ground moving target indication,
airborne moving target indication, and synthetic aperture radar. The authors have
shown that evolutionary algorithms are well suited to design optimal waveforms
for multi-mission objectives such as peak sidelobe levels, integrated sidelobe levels,
pulse integration, and revisit time. The work reported in [38] used meta-heuristic
algorithms to optimize waveforms with sparse spectrum for radar applications in
the high frequency band. In particular, a genetic algorithm and particle swarm
optimization are used to produce optimal waveforms with acceptable autocorrelation
sidelobes. It is concluded that particle swarm optimization is simpler and faster
than the genetic algorithm. The authors are of the opinion that the computational
efficiency of particle swarm optimization is comparable to, or would even be better
than, that of the adaptive method of [40].

In view of the above, this chapter considers integrated radar and communication 
systems based on waveforms known as polyphase sequences. In order to account for 
the waveform design challenges associated with such integrated systems, we have 
compared performance and potential application scenarios of different classes of 
polyphase pulse compression sequences in our earlier studies reported in [25, 26].
Specifically, these studies have revealed that Oppermann sequences can potentially
better support the considered integration, as they allow not only for the design
of families with a wide range of correlations but also support a variety of
characteristics with respect to the ambiguity function, i.e. delay-Doppler tolerance. These
sequences provide a number of parameters that can be chosen to design sequences 
for a wide range of performance characteristics. It will be shown that meta-heuristic 
algorithms are well-suited to find the optimal parameters for these sequences. Nu- 
merical results will be provided for optimal Oppermann sequences obtained with 
meta-heuristic algorithms. 

The rest of this chapter is organized as follows. In Section 7.2, an overview
of meta-heuristic algorithms is presented. A brief discussion of polyphase
sequences and the definition of Oppermann sequences is provided in Section 7.3. In
Section 7.4, performance measures are introduced. Numerical examples are given
in Section 7.5. In Section 7.6, conclusions are drawn.



7.2 Meta-Heuristic Algorithms 

Meta-heuristic algorithms, also referred to as meta-heuristics for brevity, belong to
a branch of stochastic optimization. They are utilized by both engineers and
scientists wishing to optimize solutions to problems that are intractable by
conventional methods. Meta-heuristic methods consist of two major components known
as randomization and selection of the best solutions. The first component prevents
an algorithm from getting trapped in a local optimum and also increases the diversity
of the potential solutions, while the latter component ensures convergence towards
the optimal value [10, 61, 62]. A good combination of these two components usually
ensures that the global optimum is achievable. The popularity of these algorithms
stems from their ability to solve large, complex and dynamic problems. The
efficiency of these algorithms, or of the solutions they provide, is a measure of their
ability to reach an acceptable solution within a reasonable time frame.

The applications of meta-heuristics are broad, versatile and diverse. Application
areas include controller design, applied mathematics, power systems, physics, data
mining, fuzzy systems and many others. In this chapter, we will apply some of these
algorithms to pseudo random signal processing with focus on waveform design for
integrated radar and communication systems. For this purpose, meta-heuristic
algorithms may be classified as being either population-based or
flight/trajectory-based. Genetic algorithms, for example, are population-based, as is
particle swarm optimization, which utilizes multiple particles to reach the optimal
solution. On the other hand, simulated annealing uses a single solution that moves
through the search space or design space in a piecewise manner. The essence of
simulated annealing is always to accept a better solution, whereas a not-so-good
solution is accepted with a certain probability. In the sequel, selected
state-of-the-art zero-order and meta-heuristic algorithms are presented.

7.2.1 Particle Swarm Optimization 

PSO is a population-based stochastic optimization technique which has been
inspired by the social behavior of flocks of birds, schools of fish and swarms of bees,
as proposed by Eberhart and Kennedy [30]. Since its inception, about 20
different variants of PSO have been proposed, and it remains an
active area of research. It shares many similarities with genetic and virtual ant
algorithms, including concepts such as population initialization with random solutions
and search for a global optimum solution in successive generations. However,
evolution operators like mutation and crossover as well as encoding or decoding of
the parameters into binary strings are not used with PSO algorithms. Instead, PSO uses
real-number randomness and global communication among the swarm population.
Accordingly, each member in the swarm adapts its search patterns by learning from
its own experiences and those of the other members. A member in the swarm is referred
to as a particle and represents a potential solution, which is a point in the search
space. The global optimum is regarded as the location of food [37]. Each particle has
a fitness value and a velocity to adjust its flying direction by learning from the best
experiences of the swarm to search for the global optimum in the D-dimensional
solution space. In our case, the dimension D of the problem is given by the number of
parameters that are available for optimization for a given class of sequences. In order
to avoid haphazard movements of the particles in the search space, upper and lower
bounds are usually specified on the velocity. If the velocity v falls below the
specified lower bound, it is set to v_min as a measure to prevent insufficient
exploration of the search space. On the other hand, if the velocity exceeds the
specified upper bound, it is set to v_max in order to avoid particles moving away from
or past a good solution. Similarly, the actual search range for a D-dimensional problem
is usually also constrained to a given interval [c_min, c_max]^D in order to keep the
particles from moving beyond the search boundary.
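The velocity and range limits described above can be sketched in a few lines of Python. The sketch uses the common inertia-weight update with a personal best and a global best position; the sphere objective, the parameter values (inertia, c1, c2, v_max) and the symmetric choice v_min = -v_max are illustrative assumptions, not settings taken from this chapter.

```python
import random


def sphere(x):
    """Hypothetical test objective with minimum 0 at the origin."""
    return sum(xi * xi for xi in x)


def clamp(value, low, high):
    return max(low, min(high, value))


def pso(f, dim=2, n_particles=20, n_iter=200, c_min=-5.0, c_max=5.0,
        v_max=1.0, inertia=0.7, c1=1.5, c2=1.5, seed=1):
    rng = random.Random(seed)
    v_min = -v_max
    pos = [[rng.uniform(c_min, c_max) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (inertia * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = clamp(v, v_min, v_max)      # velocity bounds
                # keep the particle inside [c_min, c_max]
                pos[i][d] = clamp(pos[i][d] + vel[i][d], c_min, c_max)
            value = f(pos[i])
            if value < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], value
                if value < gbest_val:
                    gbest, gbest_val = pos[i][:], value
    return gbest, gbest_val


best_x, best_f = pso(sphere)
print(best_x, best_f)
```

For waveform design, the objective would be replaced by a cost computed from the sequence generated by each particle's parameter vector.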

The standard PSO uses both the personal best, pbest, with respect to the location
achieved by an individual particle and the global best, gbest, referring to the
best solution/location among all particles in the swarm [30]. The concept of
personal best is primarily used to increase the diversity in finding a solution and to
avoid pulling all the particles to the global best, which may cause the algorithm to
converge prematurely without finding the overall best solution. However, such
diversity can also be simulated by using some kind of randomness [61, 62]. Based
on this observation, [62] argues that there is no need to use the personal best unless
the optimization problem is highly nonlinear and multi-modal. The version of
PSO that omits the personal best is known as accelerated PSO (APSO).
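A minimal Python sketch of such a personal-best-free update is given below. It assumes one commonly cited simplified APSO form, x ← (1 − β)x + βg + αε, with Gaussian noise ε and a randomness amplitude α that decays geometrically; this particular form, the parameter values and the sphere objective are assumptions for illustration rather than details stated in this chapter.

```python
import random


def sphere(x):
    """Hypothetical test objective with minimum 0 at the origin."""
    return sum(xi * xi for xi in x)


def apso(f, dim=2, n_particles=20, n_iter=100, low=-5.0, high=5.0,
         alpha0=0.5, gamma=0.95, beta=0.5, seed=1):
    rng = random.Random(seed)
    pos = [[rng.uniform(low, high) for _ in range(dim)]
           for _ in range(n_particles)]
    gbest = min(pos, key=f)[:]
    gbest_val = f(gbest)

    for t in range(n_iter):
        alpha = alpha0 * gamma ** t      # randomness shrinks over time
        for p in pos:
            for d in range(dim):
                # pull towards the global best, plus a random perturbation
                moved = ((1.0 - beta) * p[d] + beta * gbest[d]
                         + alpha * rng.gauss(0.0, 1.0))
                p[d] = max(low, min(high, moved))
            value = f(p)
            if value < gbest_val:
                gbest, gbest_val = p[:], value
    return gbest, gbest_val


best_x, best_f = apso(sphere)
print(best_x, best_f)
```

Only the global best is stored; the diversity that the personal best would otherwise provide comes from the random term.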



7.2.2 Harmony Search 

A new type of heuristic optimization algorithm known as harmony search (HS)
was developed by Lee and Geem [31]. It formalizes the musician's improvisation
process, i.e. inventing music while performing, into a quantitative optimization
process. It comprises the following parts: (1) usage of harmony; (2) pitch
adjustment; and (3) randomization. In an HS algorithm, each musician (decision
variable) plays (generates) a note (value) for finding a best harmony (global optimum).
In other words, a harmony translates to an optimization solution vector and the
musician's improvisation corresponds to local and global search schemes in terms of
optimization. Solutions of the optimization process correspond to a musician while
the harmony of the notes generated by a musician corresponds to the fitness of the
solution. The pitch adjustment rate r_pa ∈ [0.1, 0.5] and the so-called harmony memory
accepting rate r_accept ∈ [0.7, 0.95] ensure that the best harmonies established at
some point will be carried over to a new harmony memory. For a detailed discussion on
harmony search, the interested reader is referred to [31, 61, 62] and the references
therein.
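The three parts listed above map directly onto a short improvisation loop. The Python sketch below is a minimal illustration under stated assumptions: the sphere objective, the memory size, the fixed pitch bandwidth and the rule of replacing the worst stored harmony are hypothetical choices, while r_accept and r_pa are picked from the ranges quoted above.

```python
import random


def sphere(x):
    """Hypothetical test objective with minimum 0 at the origin."""
    return sum(xi * xi for xi in x)


def harmony_search(f, dim=2, low=-5.0, high=5.0, memory_size=20,
                   r_accept=0.9, r_pa=0.3, bandwidth=0.1, n_iter=5000, seed=1):
    rng = random.Random(seed)
    memory = [[rng.uniform(low, high) for _ in range(dim)]
              for _ in range(memory_size)]
    values = [f(h) for h in memory]

    for _ in range(n_iter):
        new = []
        for d in range(dim):
            if rng.random() < r_accept:
                # (1) usage of harmony memory: reuse a stored value
                x = rng.choice(memory)[d]
                if rng.random() < r_pa:
                    # (2) pitch adjustment: small move around the stored value
                    x += bandwidth * rng.uniform(-1.0, 1.0)
            else:
                # (3) randomization: draw a fresh value from the whole range
                x = rng.uniform(low, high)
            new.append(max(low, min(high, x)))
        new_value = f(new)
        worst = max(range(memory_size), key=values.__getitem__)
        if new_value < values[worst]:
            memory[worst], values[worst] = new, new_value

    best = min(range(memory_size), key=values.__getitem__)
    return memory[best], values[best]


best_x, best_f = harmony_search(sphere)
print(best_x, best_f)
```

Because the worst harmony is replaced only by a better one, good harmonies found at some point stay in the memory until something better displaces them.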



7.2.3 Adaptive Simulated Annealing 



The classical SA algorithm [10, 54, 61, 62] relies on the Boltzmann sampling
distribution. It comprises components such as the probability density function of
the state space g(γ) with γ being the current solution, an acceptance probability
function h(ΔE) with respect to the difference in system energy ΔE between two
design vectors, and an annealing schedule for temperature T(k) with annealing time
k using Boltzmann annealing. An enhanced version of the classical SA known as
adaptive SA (ASA) has been proposed in [20, 21, 22, 23], including comparisons, test
case studies and applications. In contrast to SA, the ASA annealing schedule for
temperature T(k) decreases exponentially in annealing time k. In addition,
re-annealing and quenching are introduced with ASA, which allow for adaptation to
changing sensitivities in multidimensional parameter spaces.
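The difference between the two annealing schedules, together with an acceptance rule of the kind used in SA, can be sketched as follows. The formulas are assumptions based on commonly cited forms: a logarithmic Boltzmann schedule T(k) = T0 / ln k, an exponential ASA-type schedule T(k) = T0 exp(−c k^(1/D)), and a Metropolis-type acceptance rule; the constants are hypothetical.

```python
import math
import random


def boltzmann_temperature(t0, k):
    """Logarithmic Boltzmann schedule, T(k) = T0 / ln(k), defined for k >= 2."""
    return t0 / math.log(k)


def asa_temperature(t0, k, c=1.0, dim=2):
    """Exponential ASA-type schedule, T(k) = T0 * exp(-c * k**(1/D))."""
    return t0 * math.exp(-c * k ** (1.0 / dim))


def accept(delta_e, temperature, rng):
    """Always accept an improvement; accept a worse design with
    probability exp(-delta_e / T) (Metropolis-type rule)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


rng = random.Random(1)
for k in (2, 10, 100, 1000):
    print(k, boltzmann_temperature(1.0, k), asa_temperature(1.0, k))
print(accept(-0.5, 1.0, rng), accept(0.5, 1.0, rng))
```

The exponential schedule cools far faster than the logarithmic one, which is why re-annealing is useful for adapting the temperatures when sensitivities change.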

7.2.4 Artificial Bee Colony Algorithm 

The ABC algorithm was proposed by Karaboga fZfl in 2005. It simulates the forag- 
ing behavior associated with bee colonies. A colony of honey bees can extend itself 
over long distances, sometimes more than 10 kilometers and in multiple directions 
simultaneously to exploit a large number of food sources. In a bee colony, tasks are
divided among the specialized individuals or bees, namely employed, onlooker and
scout bees. The population in a bee colony is divided into two halves. The first half
of the population consists of employed bees while the second half includes the
onlooker bees. The foraging process begins in a colony by scout bees being sent to
search for promising food sources. Scout bees move from one food source to another
in a random fashion. Employed bees perform the duties of exploiting the possible
food sources and passing on the information about the quality of the food source
to the onlooker bees. The decision taken by onlooker bees to exploit a potential
food source depends on the information provided by the employed bees. The ABC
algorithm has been used to solve both unconstrained and constrained optimization
problems [13], [27], [28], [29]. It requires only a few control parameters such as the
colony size and the maximum number of cycles [29].
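The division of labour described above can be sketched as a small minimisation routine. This is a minimal illustration of the commonly described ABC scheme, not the implementation used for the results in this chapter. The abandonment threshold `limit`, the fitness mapping, the neighbour rule and all names are assumptions introduced for the example.

```python
import random

def abc_minimize(objective, bounds, colony_size=20, max_cycles=200, limit=20, seed=0):
    """Minimise objective over a box; half of the colony are employed bees."""
    rng = random.Random(seed)
    n_sources = colony_size // 2            # one employed bee per food source
    dim = len(bounds)

    def random_source():                    # scout behaviour: random position
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def fitness(value):                     # assumed mapping: larger means better
        return 1.0 / (1.0 + value) if value >= 0 else 1.0 + abs(value)

    sources = [random_source() for _ in range(n_sources)]
    values = [objective(s) for s in sources]
    trials = [0] * n_sources                # unsuccessful attempts per source

    def try_neighbour(i):
        # Move one coordinate of source i relative to a random partner k.
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.randrange(dim)
        candidate = sources[i][:]
        candidate[j] += rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        candidate[j] = min(max(candidate[j], bounds[j][0]), bounds[j][1])
        value = objective(candidate)
        if value < values[i]:               # greedy selection
            sources[i], values[i], trials[i] = candidate, value, 0
        else:
            trials[i] += 1

    best_value, best = min(zip(values, sources))
    best = best[:]
    for _ in range(max_cycles):
        for i in range(n_sources):          # employed bees
            try_neighbour(i)
        weights = [fitness(v) for v in values]
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_neighbour(i)                # onlookers prefer good sources
        cycle_value, cycle_best = min(zip(values, sources))
        if cycle_value < best_value:        # memorise the best source so far
            best_value, best = cycle_value, cycle_best[:]
        worst = max(range(n_sources), key=lambda s: trials[s])
        if trials[worst] > limit:           # abandoned source: send a scout
            sources[worst] = random_source()
            values[worst] = objective(sources[worst])
            trials[worst] = 0
    final_value, final_best = min(zip(values, sources))
    if final_value < best_value:
        best_value, best = final_value, final_best[:]
    return best, best_value

if __name__ == "__main__":
    print(abc_minimize(lambda p: p[0] ** 2 + p[1] ** 2, [(-5.0, 5.0), (-5.0, 5.0)]))
```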

7.2.5 Preliminaries for Waveform Design 

From this point onwards, we will consider two-dimensional optimization problems
unless otherwise specified. In the context of waveform design using Oppermann
sequences, the terms swarm in APSO, harmonies in HS, bees in ABC and candidate
points in ASA relate to the parameters m and n which define a specific sequence
family. In all these algorithms, the control parameters are defined in the
initialization phase. All the algorithms start with a randomly distributed population,
except for ASA, which starts with an initial guess in the search space. In each step
of the algorithms, there is always a solution or a set of solutions representing the
current state of the algorithm. These solutions are used to generate the phases of the
Oppermann sequences (see Section 7.3). In order to distinguish good waveform designs
from inferior designs, waveform characteristics such as aperiodic correlations, figure
of merit, and integrated sidelobe measures are computed. The interested reader can
find pseudo code of HS in [62], ASA in [52], and ABC in [27], while details of the
APSO can be found in [61], [62].

7.3 Polyphase Sequences and Their Applications

The history of complex-valued sequences ranges back as far as the 1950s when
polyphase sequences were considered in many research laboratories. As the related
research outcomes were reported mainly in classified documents with limited
access, a broader audience was first reached with the work in [16] on phase shift
pulse sequences. In the following decades, many complex-valued sequences have
been proposed and analyzed with their applications ranging from radar systems to
spread-spectrum communication systems. In particular, polyphase sequences have
gained increased attention due to their ability to match regular phase shift keying
modulation schemes. In addition, the advances in integrated circuit technologies
have paved the way for moving from simple binary sequences to implementations
of complex-valued sequences and the related, more involved pseudo random signal
processing. In the sequel, we consider polyphase sequences and will shed some
light on their potential to serve in integrated radar and communication systems. In
particular, the family of Oppermann sequences [45] is considered in more detail as
it offers the system designer large sets of sequences with a wide range of correlation
properties compared to other classes of polyphase sequences.

7.3.1 Polyphase Sequences for Radar Systems

Pseudo random sequences and the related signal processing have emerged from
space and military applications. In this context, the concept of pulse compression,
i.e. expanded pulses with large time-bandwidth products, has been utilized
in radar systems. This type of signal offers high range resolution as it can obtain
high pulse energy and large pulse width. As an alternative to frequency-modulated
signals, pulse compression sequences have been the subject of many studies [14], [32].
Polyphase sequences are known to have better Doppler tolerance for a broader
range-Doppler coverage than binary sequences [18], [32], [41], [46]. These sequences
can be derived from the phase history of chirp or step chirp analog signals and
can be processed digitally [36]. In radar applications, the performance of different
polyphase sequences can be compared in terms of delay or range tolerance
using measures such as the autocorrelation function, mainlobe-to-total-sidelobe ratio
and peak-to-sidelobe ratio. The sensitivity of a particular waveform design towards
Doppler shifts in case of moving targets can be characterized by using the
ambiguity function. As there exists no analytical method that would allow for
synthesizing the desired waveform given its desired ambiguity function, more practical
optimization approaches are needed to facilitate such designs. For example,
the design of a particular radar waveform may first aim for optimization of the
autocorrelation properties with respect to range characteristics, followed by evaluating
the ambiguity function to identify the Doppler tolerance of the deduced
sequence.

As far as radar applications are concerned, Frank sequences [12] were the first
polyphase sequences used in pulse compression radar [46]. They can only be designed
for perfect square lengths and therefore have limited family size. Later,
in [34], modified versions of Frank sequences were obtained by permuting their
phase history. The modified versions are referred to as P1 and P2 sequences.
Rapajic and Kennedy in BSl proposed a new class of sequences, known as Px sequences.
These sequences have superior performance in terms of integrated sidelobe
levels compared to Frank, P1, and P2 sequences. However, for even square
root sequence lengths, their performance is the same as for P2 sequences. In [35],
the families of P3 and P4 sequences were proposed that can be constructed for
any length. The authors of [6], [13] generalized the ideas behind Frank sequences
resulting in Frank-Zadoff-Chu (FZC) sequences which can also be designed for
any length. Several performance aspects of the aforementioned classes of polyphase
sequences with respect to radar applications have been discussed in the literature.

7.3.2 Polyphase Sequences for Communication Systems

A major boost for the application of pseudo random sequences in the field of
communication systems was given by the development of cellular mobile communication
systems and spread-spectrum based radios for indoor communication.
In particular, the CDMA system for digital cellular phone applications by Qualcomm
Incorporated and the family of IEEE 802.11 standards for wireless local
area networks (WLANs) have taken the theoretical concepts into practical systems.
The main classes of sequences used with these systems are Walsh-Hadamard sequences
[11], [55], m-sequences [11], [63], Barker codes [11], [63], and complementary
code keying based modulation 1^1 . Subsequently, with the advent of the third
generation of mobile communication systems, more advanced spread-spectrum
techniques such as orthogonal variable spreading factor sequences fl^ and complex-valued
short scrambling sequences have been utilized. In contrast to radar
applications where it is usually sufficient to have a single sequence with good
autocorrelation characteristics, communication systems require a set of sequences to
facilitate simultaneous channel access to a number of users. Clearly, minimum
crosscorrelation among the sequences is a major design consideration in this case.
Given the large advances in modern integrated circuit technologies, it has become
feasible to implement complex-valued sequence designs including polyphase sequences
such as Frank sequences, FZC sequences, and Oppermann sequences.

7.3.3 Application of Oppermann Sequences for Integrated Radar 
and Communication Systems 

Given the insights from the brief overview on polyphase sequences from the view- 
point of radar and communication applications, it can be concluded that more flex- 
ible waveform designs are needed to address the conflicting objectives of these two 
applications. Our earlier research ["SFi^Sl on this topic has revealed that Oppermann
sequences may serve favorably in such integrated radar and communication
systems compared to conventional waveform designs. This is mainly due to the fact 
that families of Oppermann sequences can be designed for a wide range of correla- 
tion properties. For any given sequence length, Oppermann sequences are defined by 
three parameters. These parameters can be used in an optimization process to control 
the progression of the autocorrelation function, crosscorrelation function, the power 
spectral density and characteristics of the ambiguity function. Due to space limi- 
tations, however, we will concentrate here on range (autocorrelation) and multiple 
access (crosscorrelation) characteristics. On the other hand, inclusion of moving tar- 
gets and the related Doppler shifts into the framework of meta-heuristic algorithms 
may be addressed in our future research considering ambiguity and cross-ambiguity 
functions. 

In this chapter, we consider weighted pulse trains that can be described by a
complex envelope as

$$ s_x(t) = \sum_{i=0}^{N-1} u_x(i)\,\mathrm{rect}\!\left(\frac{t - iT_c}{T_w}\right) \tag{7.1} $$

where $T = NT_c$ is the duration of the xth pulse train while $T_c$ and $T_w \le T_c$,
respectively, denote the repetition period and the width of each rectangular pulse

$$ \mathrm{rect}\!\left(\frac{t}{T_w}\right) = \begin{cases} 1, & 0 \le t < T_w \\ 0, & \text{otherwise} \end{cases} \tag{7.2} $$

The elements $u_x(i)$, $i = 0, 1, \ldots, N-1$, of the xth complex-valued sequence $u_x$ of
length N represent the weights of the pulse train in (7.1). In general, these elements
are given for a polyphase sequence as

$$ u_x(i) = \exp[j\varphi_x(i)], \quad j = \sqrt{-1} \tag{7.3} $$

where the set of N phases $\{\varphi_x(0), \varphi_x(1), \ldots, \varphi_x(N-1)\}$ is referred to as the phase
sequence. In particular, the phase $\varphi_x(i)$ of the ith element $u_x(i)$ of the xth Oppermann
sequence $u_x = [u_x(0), u_x(1), \ldots, u_x(N-1)]$ of length N taken from a family or set
$\mathcal{U}$ of sequences is given as

$$ \varphi_x(i) = \frac{\pi}{N}\left[x^m (i+1)^p + (i+1)^n + x(i+1)N\right] \tag{7.4} $$

where $1 \le x \le N-1$, $0 \le i \le N-1$, and the integers x are relatively prime to the length
N. The maximum size of a family $\mathcal{U}$ of Oppermann sequences is obtained as $N - 1$
when the length N of the sequences is a prime number. A particular family of
Oppermann sequences is defined by the real-valued parameters m, n, and p. All the
sequences in a family have the same magnitude of the autocorrelation function for
a fixed combination of these three parameters. In [45], it has been shown that the
magnitude of the autocorrelation function depends only on the parameter n if the
parameter p = 1. In this case, the terms of (7.4) that involve x and m are linear in i
and therefore change the aperiodic autocorrelation at a given shift l only by a phase
factor of unit magnitude, so that the autocorrelation magnitude follows the expression

$$ |C_x(l)| = \frac{1}{N}\left|\sum_{i=0}^{N-1-l} \exp\left\{\frac{j\pi}{N}\left[(i+1)^n - (i+l+1)^n\right]\right\}\right|, \quad 0 \le l \le N-1 \tag{7.5} $$

In the sequel, we therefore focus on the case of p = 1 which leaves us with m and n
as free parameters for use in an optimized waveform design.
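As an illustration of the sequence definition, the following sketch generates a family of Oppermann sequences from the phase law $\varphi_x(i) = (\pi/N)[x^m (i+1)^p + (i+1)^n + x(i+1)N]$ and checks numerically that, for p = 1, the autocorrelation magnitude does not change with m. The function names are hypothetical, the 1/N normalisation of the correlation is an assumption, and NumPy is assumed to be available.

```python
import math
import numpy as np

def oppermann_family(N, m, n, p=1.0):
    """Rows are the sequences u_x (x coprime to N); columns are i = 0..N-1."""
    x = np.array([k for k in range(1, N) if math.gcd(k, N) == 1], dtype=float)[:, None]
    i1 = np.arange(1, N + 1, dtype=float)[None, :]           # the term (i + 1)
    phase = (np.pi / N) * (x**m * i1**p + i1**n + x * i1 * N)
    return np.exp(1j * phase)

def aac_magnitude(u):
    """|C_x(l)| for l = 0..N-1, normalised by the sequence length."""
    N = len(u)
    return np.abs(np.correlate(u, u, mode="full"))[N - 1:] / N

if __name__ == "__main__":
    fam_a = oppermann_family(31, m=1.0, n=2.0)
    fam_b = oppermann_family(31, m=2.5, n=2.0)
    print(fam_a.shape)                                       # family size x length
    # Different x and different m, same n and p = 1: the magnitudes should agree.
    print(np.allclose(aac_magnitude(fam_a[0]), aac_magnitude(fam_b[7])))
```

For a prime length such as N = 31 every x from 1 to N - 1 is admissible, so the family has N - 1 members.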

Due to the general definition of Oppermann sequences, they include some more
specific sequences. For example, for the parameters m = 2, n = −∞, p = 1, FZC
sequences can be generated. As such, application of the considered meta-heuristic
algorithms to these more specific sequences is straightforward.

7.4 Performance Measures

In the following sections, the definitions of the measures used in the performance
comparison of the considered Oppermann sequences will be given. Specifically, let
an Oppermann sequence of length N be denoted as $u_x = [u_x(0), u_x(1), \ldots, u_x(N-1)]$
where the subscript $1 \le x \le U$ relates to the xth sequence $u_x$ taken from a given set $\mathcal{U}$
of size U.



7.4.1 Aperiodic Correlation Measures 

In order to quantify the degree of similarity between a given sequence and a shifted
version of it or between different sequences from a given set, respectively,
autocorrelation and crosscorrelation measures are usually considered. In many fields,
aperiodic signals need to be processed which occur only once within a considerable
time span and appear to the application as more or less singular events. Accordingly,
the aperiodic crosscorrelation (ACC) between two complex-valued sequences
$u_x = [u_x(0), u_x(1), \ldots, u_x(N-1)]$ and $u_y = [u_y(0), u_y(1), \ldots, u_y(N-1)]$ of length N at
discrete shift l is given as [11], [63]

$$ C_{x,y}(l) = \begin{cases} \dfrac{1}{N}\sum\limits_{i=0}^{N-1-l} u_x(i)\,u_y^*(i+l), & 0 \le l \le N-1 \\ \dfrac{1}{N}\sum\limits_{i=0}^{N-1+l} u_x(i-l)\,u_y^*(i), & 1-N \le l < 0 \\ 0, & |l| \ge N \end{cases} \tag{7.6} $$

where $(\cdot)^*$ denotes the complex conjugate of the argument $(\cdot)$. In case of $u_x = u_y$,
(7.6) is referred to as aperiodic autocorrelation (AAC) and is denoted as $C_x(l) = C_{x,x}(l)$.
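The piecewise definition of the aperiodic crosscorrelation translates almost line by line into code. The sketch below assumes the 1/N normalisation and the three shift ranges (non-negative shifts, negative shifts, and zero outside the overlap). The function name `acc` and the example sequences are hypothetical.

```python
import cmath

def acc(ux, uy, l):
    """Aperiodic crosscorrelation C_{x,y}(l) of two equal-length sequences."""
    N = len(ux)
    if abs(l) >= N:                          # no overlap between the sequences
        return 0j
    if l >= 0:
        total = sum(ux[i] * uy[i + l].conjugate() for i in range(N - l))
    else:
        total = sum(ux[i - l] * uy[i].conjugate() for i in range(N + l))
    return total / N

if __name__ == "__main__":
    # Two hypothetical four-phase sequences of length 4 (illustrative values only).
    ux = [cmath.exp(1j * cmath.pi * k / 2) for k in (0, 1, 3, 2)]
    uy = [cmath.exp(1j * cmath.pi * k / 2) for k in (0, 2, 1, 1)]
    print(abs(acc(ux, ux, 0)))               # in-phase AAC of a unit-modulus sequence
    print(acc(ux, uy, -2), acc(uy, ux, 2).conjugate())
```

The last line illustrates the symmetry $C_{x,y}(-l) = C_{y,x}^*(l)$, which follows directly from the two sums and means that only non-negative shifts need to be evaluated for the autocorrelation.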

In addition to ACC and AAC, it is often more realistic to incorporate the whole 
range of possible correlation values into the performance evaluation of a given set of 
sequences rather than considering only peak values of aperiodic correlations. In this 
context, mean-square values from AAC and ACC may be used in favor of worst case 
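scenarios. For this purpose, let us introduce the mean-square out-of-phase
autocorrelation (MSAC), $R_{ac}$, and mean-square crosscorrelation (MSCC), $R_{cc}$, respectively,
of a given set $\mathcal{U}$ of size U as

$$ R_{ac} = \frac{1}{U}\sum_{x=1}^{U}\;\sum_{\substack{l=1-N \\ l \ne 0}}^{N-1} |C_x(l)|^2 \tag{7.7} $$

$$ R_{cc} = \frac{1}{U(U-1)}\sum_{x=1}^{U}\;\sum_{\substack{y=1 \\ y \ne x}}^{U}\;\sum_{l=1-N}^{N-1} |C_{x,y}(l)|^2 \tag{7.8} $$

7.4.2 Sidelobe Measures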

The figure of merit (FOM) of a sequence $u_x \in \mathcal{U}$, $1 \le x \le U$, of length N with
aperiodic autocorrelation function $C_x(l)$ measures the ratio of the energy in the mainlobe
to the energy in the sidelobes of the autocorrelation function. It is defined as

$$ \mathrm{FOM}_x = \frac{|C_x(0)|^2}{2\sum\limits_{l=1}^{N-1} |C_x(l)|^2} \tag{7.9} $$

Alternatively, the integrated sidelobe level (ISL) is often used for radar applications
in the context of distributed target environments. The ISL of a sequence $u_x \in \mathcal{U}$,
$1 \le x \le U$, of length N is defined as

$$ \mathrm{ISL}_x = \frac{1}{\mathrm{FOM}_x} = \frac{2\sum\limits_{l=1}^{N-1} |C_x(l)|^2}{|C_x(0)|^2} \tag{7.10} $$

Another important measure in relation to radar applications is the peak-to-sidelobe
ratio (PSLR) which relates to the ability to detect targets without masking by
interfering targets. For example, if an AAC has large sidelobes, it will mask nearby
targets and leave them undetected. Specifically, the PSLR of a sequence $u_x$ measures
the ratio of the in-phase value $C_x(0)$ to the maximum sidelobe magnitude $|C_x(l)|$ of
the aperiodic autocorrelation function $C_x(l)$. It is defined as

$$ \mathrm{PSLR}_x = \frac{C_x(0)}{\max\limits_{1 \le l < N} |C_x(l)|} \tag{7.11} $$
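The three sidelobe measures can be computed from the non-negative shifts of the aperiodic autocorrelation alone. The sketch below assumes the definitions FOM = $|C_x(0)|^2 / (2\sum_{l \ge 1}|C_x(l)|^2)$, ISL = 1/FOM and PSLR = $C_x(0)/\max_{l \ge 1}|C_x(l)|$ on a linear scale. The length-13 Barker code is used as a test input because its sidelobes are well known. The function names are hypothetical and NumPy is assumed to be available.

```python
import numpy as np

def aperiodic_autocorrelation(u):
    """C_x(l) for l = 0..N-1 with 1/N normalisation."""
    u = np.asarray(u, dtype=complex)
    N = len(u)
    full = np.correlate(u, u, mode="full") / N    # lags -(N-1) .. N-1
    return np.conj(full[N - 1:])                  # keep the non-negative lags

def sidelobe_measures(u):
    """Return (FOM, ISL, PSLR) on a linear scale."""
    c = aperiodic_autocorrelation(u)
    mainlobe = np.abs(c[0]) ** 2
    sidelobes = np.abs(c[1:]) ** 2
    fom = mainlobe / (2.0 * sidelobes.sum())
    isl = 1.0 / fom
    pslr = np.abs(c[0]) / np.sqrt(sidelobes.max())
    return fom, isl, pslr

if __name__ == "__main__":
    barker13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
    fom, isl, pslr = sidelobe_measures(barker13)
    print(f"FOM = {fom:.3f}, ISL = {isl:.3f}, PSLR = {pslr:.1f}")
```

For the length-13 Barker code the unnormalised sidelobes alternate between 0 and 1, so the figure of merit is 169/12 (about 14.08) and the PSLR is 13, which gives a convenient sanity check for any implementation of these measures.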



7.5 Numerical Examples 

In the sequel, some numerical examples are provided to illustrate the application
of meta-heuristic algorithms for waveform optimization for integrated radar and
communication systems. For this purpose, we consider the class of Oppermann
sequences as defined in (7.4) of length N = 31. It is noted that the maximum number
of N − 1 = 30 sequences in the designed set is obtained as N is chosen as a prime
number. Furthermore, the considered sequence family offers parameters m and n for
optimization given the case of parameter p = 1. Accordingly, the following
optimization problems may be posed:

$$ \text{P1:} \quad \min_{n \in [n_1, n_2]} \mathrm{ISL}(\mathcal{U}) \tag{7.12} $$

$$ \text{P2:} \quad \max_{n \in [n_1, n_2]} \mathrm{PSLR}(\mathcal{U}) \tag{7.13} $$

$$ \text{P3:} \quad \min_{m \in [m_1, m_2],\; n \in [n_1, n_2]} \left[R_{ac}(\mathcal{U}) + a\,R_{cc}(\mathcal{U})\right] \tag{7.14} $$

where $m \in [m_1, m_2]$ and $n \in [n_1, n_2]$ are the search regions for m and n, respectively,
and a is a weighting factor. While problems P1 and P2 given in (7.12) and (7.13),
respectively, relate strongly to radar applications, problem P3 formulated in (7.14)
can be used to find a trade-off between conflicting objectives of radar and
communication applications. In particular, the weighting factor a may be chosen with respect
to desirable system specifications. In contrast to [25], where we have used a two-step
approach to first optimize autocorrelation properties by a simple brute-force search
over parameter n followed by tuning m towards favorable delay-Doppler properties,
we consider here two-dimensional optimization to simultaneously find the optimal
values of n and m for problem P3. On the other hand, in view of the independence
of the autocorrelation of Oppermann sequences from parameter m as shown in (7.5),
problems P1 and P2 remain one-dimensional as PSLR and ISL only involve the
aperiodic autocorrelation.
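To make problem P3 concrete, the sketch below evaluates the weighted objective $R_{ac} + a R_{cc}$ for a given pair (m, n). It assumes the usual mean-square definitions (the MSAC averages the out-of-phase autocorrelation energy over the set, the MSCC averages the crosscorrelation energy over all ordered pairs of distinct sequences) and a 1/N correlation normalisation. The coarse grid scan at the end is only a stand-in for the meta-heuristic search and is not one of the algorithms used in this chapter. All function names are hypothetical.

```python
import math
import numpy as np

def oppermann_family(N, m, n, p=1.0):
    """Family of Oppermann sequences; rows are sequences, columns are elements."""
    x = np.array([k for k in range(1, N) if math.gcd(k, N) == 1], dtype=float)[:, None]
    i1 = np.arange(1, N + 1, dtype=float)[None, :]
    return np.exp(1j * (np.pi / N) * (x**m * i1**p + i1**n + x * i1 * N))

def mean_square_correlations(family):
    """Return (R_ac, R_cc) for a set of sequences (assumed definitions)."""
    U, N = family.shape
    rac = rcc = 0.0
    for x in range(U):
        for y in range(U):
            energy = np.abs(np.correlate(family[x], family[y], mode="full") / N) ** 2
            if x == y:
                rac += energy.sum() - energy[N - 1]   # exclude the in-phase value
            else:
                rcc += energy.sum()
    return rac / U, rcc / (U * (U - 1))

def p3_objective(m, n, a, N=31):
    """Weighted objective R_ac + a * R_cc of problem P3."""
    rac, rcc = mean_square_correlations(oppermann_family(N, m, n))
    return rac + a * rcc

if __name__ == "__main__":
    # Coarse scan of the box [0, 4] x [0, 4]; illustrative only.
    for a in (0.0, 1.0, 60.0):
        grid = [(p3_objective(m, n, a), m, n)
                for m in np.linspace(0.0, 4.0, 5) for n in np.linspace(0.0, 4.0, 5)]
        print(a, min(grid))
```

Any of the four meta-heuristics can be wrapped around `p3_objective` by treating (m, n) as the position of a particle, harmony, candidate point or food source.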

In order to solve the problems formulated in (7.12)-(7.14), we use APSO, HS,
ASA and ABC. The two-dimensional search space was constrained to the intervals
$m \in [0, 4]$ and $n \in [0, 4]$. The algorithms were executed on a laptop computer with an
Intel Pentium M 740 processor running at 1.73 GHz and 2048 Megabytes of RAM.
With the exception of ASA, where we used a C-routine called from MATLAB, all
the other algorithms have been implemented in MATLAB. As for the translation of
the notions from meta-heuristics to the optimization problem at hand, the following
interpretation can be given.

• APSO: Initially, particles in a swarm are randomly distributed in a D-dimensional
search space. In APSO, the parameter D refers to the dimension of the problem,
swarm refers to a population, and particle is similar to an individual. Each
solution (or particle) flies through the search space and looks for an
optimal position to land. In terms of Oppermann sequences, particles are represented
by the values of m and n in a two-dimensional search space and are used to
generate the phases of Oppermann sequences as defined in (7.4). The search for
the optimal landing position, i.e. finding optimal values of m and n, will continue
until the criteria selected from (7.7) to (7.11) are met.

• HS: Initially, harmonies are randomly generated in a D-dimensional space and
are stored in a harmony memory (HM). The use of HM ensures that the best
harmonies will be carried over to the new harmony memory. As for the optimization
of Oppermann sequences, the parameters m and n are represented by the obtained
harmonies to generate phases as defined in (7.4). Then, pitch adjustment is used to
control the convergence of the algorithm. Randomization introduced in the algorithm
drives the algorithm to search previously unexplored areas in the search space until
the criteria selected from (7.7) to (7.11) are met.

• ASA: This algorithm starts with the initial guess of the parameters in the
D-dimensional search space. In terms of Oppermann sequences, the initial guess
represents values of the parameters m and n. Each step of the ASA algorithm
replaces the current solution by a random nearby solution. The obtained solutions
are used to generate Oppermann sequences. The process of finding optimal values
of m and n continues by generating feasible points in the search space and
evaluating the acceptance probability, including annealing and re-annealing of the
temperatures, until the criteria selected from (7.7) to (7.11) are met.

• ABC: It is recalled that food sources are randomly distributed in the D-dimensional
search space at the start of the search. Here, bees refer to a population of
bees (employed, onlooker and scout bees) which are in search of the best food
position. Employed bees search for new food sources within their neighborhood
that have more nectar compared to the food sources they have previously visited.
These food sources represent the values of the parameters m and n of Oppermann
sequences to generate the phases defined in (7.4). If the criteria set for (7.7) to
(7.11) are not met during the optimization process, the food source is abandoned,
which corresponds to a bad sequence design. The final food position represents the
optimal values of m and n that satisfy the criteria set for (7.7) to (7.11).



Figure 7.1 compares the performance of Oppermann sequences obtained through
meta-heuristics in terms of PSLR with the brute-force search method with fixed
step size reported in [25]. Clearly, the random search strategy employed in
meta-heuristics widens the search area allowing the particles to explore the search space
more effectively compared to an optimization using fixed step size. As can be seen
from the figure, PSLR values can be improved for those prime lengths that would
have inferior performance using brute-force search with fixed increment on n. In
this case, meta-heuristic algorithms improve the performance of the designed set of
Oppermann sequences to be comparable to other families such as the FZC sequences
(see also [25]).




Fig. 7.1 Performance comparisons between brute-force search with fixed increment and
meta-heuristic algorithms in terms of PSLR (horizontal axis: prime length)



The convergence behavior of the considered algorithms for the example of
optimizing PSLR is illustrated in Fig. 7.2. It can be seen from the progressions in terms
of iterations shown in the figure that ASA achieves the fastest convergence to the
optimal values followed by APSO, ABC and HS. The fast convergence of ASA may
be attributed to the fact that exponential annealing permits the algorithm to
adaptively re-anneal and pace the convergence in the search space in all dimensions. It
should be mentioned that a similar convergence behavior and ranking among the
algorithms can be observed when they are applied to optimize FOM, ISL, and mean-square
aperiodic correlation measures.

Fig. 7.2 Convergence of different meta-heuristic algorithms towards optimal PSLR: (a)
APSO, (b) HS, (c) ASA, (d) ABC

Tables 7.1(a)-(e) show numerical results of optimal designs for Oppermann
sequences of length N = 31 with respect to the optimization problems posed in (7.12),
(7.13), and (7.14) using APSO, HS, ASA, and ABC. As for the optimal designs
presented in Table 7.1(a) and Table 7.1(b) for PSLR and ISL, respectively, it is
sufficient to consider only the parameter n as these metrics involve only the AAC (see
also (7.10) and (7.11)). It is recalled that according to (7.5), the AAC is independent
of the parameter m for the considered case of parameter p = 1. Also, all N − 1 = 30
Oppermann sequences in an optimized set achieve the same PSLR and ISL. Clearly,
all considered meta-heuristic algorithms converge towards very similar results for
these two classical design objectives of radar systems.

In order to illustrate the trade-off in waveform optimization for integrated radar
and communication systems, let us focus now on the results presented in
Tables 7.1(c)-(e) with respect to the optimization problem posed in (7.14). In
particular, we have chosen a = 0 relating to radar systems, a = 60 emphasizing
communication systems, and a = 1 as an example of an integrated radar and
communication scenario. Clearly, the autocorrelation properties indicated by the small
$R_{ac}$ values in Table 7.1(c) are beneficial for radar systems and are independent of
parameter m. On the other hand, good crosscorrelation characteristics are shown in
Table 7.1(d) for use with communication systems but these come at the expense
of poor autocorrelation properties quantified by high values of $R_{ac}$. The results of
the trade-off example shown in Table 7.1(e) may perform favorably with integrated
radar and communication systems, keeping autocorrelation values low and driving
crosscorrelation values smaller. An additional increase of a would result in an
increase of autocorrelation values and further reduce crosscorrelation values. Also,
all four considered meta-heuristic algorithms provide very similar outcomes to the
different optimization problems.

Table 7.1 Optimal designs for Oppermann sequences of length N = 31

(a) Peak-to-sidelobe ratio

Algorithm    n        PSLR
APSO         2.000    11.735
HS           2.000    11.734
ASA          2.000    11.735
ABC          2.000    11.735

(b) Integrated sidelobe level

Algorithm    n        ISL
APSO         2.007    0.110
HS           2.007    0.110
ASA          2.000    0.116
ABC          2.007    0.110

(c) MSAC; a = 0

Algorithm    m        n        Rac      Rcc
APSO         2.597    2.007    0.110    1.000
HS           2.744    2.007    0.110    1.001
ASA          2.000    2.000    0.116    1.000
ABC          0.614    2.007    0.110    1.005

(d) MSCC; a = 60

Algorithm    m        n        Rac       Rcc
APSO         1.003    1.002    19.676    0.341
HS           1.003    1.000    19.677    0.341
ASA          1.000    1.000    19.677    0.344
ABC          1.003    1.000    19.677    0.341

(e) MSAC+MSCC; a = 1

Algorithm    m        n        Rac      Rcc
APSO         0.930    2.007    0.110    0.997
HS           1.000    2.007    0.110    0.996
ASA          1.000    2.000    0.116    0.996
ABC          0.999    2.007    0.110    0.996



7.6 Conclusions 



In this chapter, we have focused on the waveform optimization for integrated radar
and communication systems. Given the conflicting requirements on autocorrelation
and crosscorrelation characteristics, meta-heuristic algorithms are considered to
basically perform a multidimensional optimization. Specifically, the selected class of
Oppermann sequences allows for designing families with a wide range of
correlations with respect to a two-dimensional search space. The numerical
results illustrate the potential of meta-heuristic algorithms for designing sequences
for radar, communications, as well as integrated systems. By way of example with
respect to PSLR, it is shown that meta-heuristics can improve performance
compared to search methods with fixed increment.

Chapter 8

Parameter Estimation from Laser Flash Experiment Data


Louise Wright, Xin-She Yang, Clare Matthews, Lindsay Chapman, 
and Simon Roberts 



Abstract. Optimisation techniques are commonly used for parameter estimation 
in a wide variety of applications. The application described here is a laser flash 
thermal diffusivity experiment on a layered sample where the thermal properties 
of some of the layers are unknown. The aim is to estimate the unknown properties 
by minimising, in a least squares sense, the difference between model predictions 
and measured data. Two optimisation techniques have been applied to the problem. 
Results suggest that the classical nonlinear least-squares optimiser is more efficient 
than particle swarm optimisation (PSO) for this type of problem. Results have also 
highlighted the importance of defining a suitable objective function and choosing 
appropriate model parameters. 

8.1 Introduction 

Many components that operate in a high-temperature corrosive environment, such as 
engine parts and turbine blades, use coatings to increase their operational lifetime. 
In some cases these coatings are grown on the component by reaction (e.g. oxide 
layers), and in other cases they are separate substances applied to the surface of the 
component before it is put into operation. It is often difficult to obtain samples of the 
coating on its own, since the coating is often too thin and too fragile to be removed 
from the component in pieces of a usable size. 

Accurate prediction of the behaviour and, in particular, the lifetime of such com- 
ponents in operation can avoid unexpected component failure and hence reduce 
downtime and maintenance costs. Models for prediction of component lifetime
generally require a coupled thermal-mechanical analysis to predict stresses caused
by differential thermal expansion and oxide layer growth. The thermal part of this 
analysis requires knowledge of the thermal conductivity of each material present 
within the component, including coatings. 

This chapter describes the application of an optimisation process to a finite vol- 
ume model of the laser flash experiment using a layered sample. The optimisation 
minimises the difference between the measured data and model predictions by ad- 
justing model parameters, including the thermal conductivity of one of the layers. 
The aim is to demonstrate that the thermal conductivity of a layer within a sample 
can be obtained using optimisation techniques. 

The process described shows the steps required for application of optimisation 
techniques to a real-world problem: data preparation, model development, choice 
of objective function and parameters, and choice of an appropriate optimisation 
method. The work reported also illustrates that each of these steps may be revis- 
ited repeatedly before a fit-for-purpose model is achieved. 

The laser flash experiment will be summarised in section 8.2 and the initial
model used to simulate the experiment will be defined in section 8.3. The initial
optimisation results obtained will be discussed in section 8.4 and subsequent
alterations to the model will be explained in section 8.4.2. The final optimisation results
will be discussed in section 8.4.3. Our concluding remarks are given in the final section.



8.2 The Laser Flash Experiment 

The laser flash experiment measures the thermal diffusivity of materials. Thermal
diffusivity is a measure of how quickly heat travels through a material and has units
of m² s⁻¹. Thermal diffusivity $\alpha$ is related to density $\rho$, thermal conductivity $\lambda$, and
specific heat capacity $c_p$ by the equation

$$ \alpha = \frac{\lambda}{\rho c_p}, \tag{8.1} $$

and so if the density and the specific heat capacity of a material can be obtained 
from other experiments, the thermal conductivity can be calculated from the thermal 
diffusivity. 
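As a small worked example of this conversion, the sketch below rearranges the relation between diffusivity, density, specific heat capacity and conductivity. The material values are hypothetical round numbers chosen for illustration and are not taken from the experiment described in this chapter.

```python
def thermal_conductivity(alpha, rho, cp):
    """Conductivity in W m^-1 K^-1 from diffusivity (m^2 s^-1),
    density (kg m^-3) and specific heat capacity (J kg^-1 K^-1)."""
    return alpha * rho * cp

def thermal_diffusivity(lam, rho, cp):
    """Diffusivity in m^2 s^-1 from conductivity, density and specific heat capacity."""
    return lam / (rho * cp)

if __name__ == "__main__":
    # Hypothetical values: alpha = 4e-6 m^2/s, rho = 8000 kg/m^3, cp = 500 J/(kg K)
    lam = thermal_conductivity(4.0e-6, 8000.0, 500.0)
    print(lam, thermal_diffusivity(lam, 8000.0, 500.0))
```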

The laser flash experiment generates a set of temperature measurements gathered 
over time. The model used to determine the thermal diffusivity from the measure- 
ments is based on a number of assumptions, including the assumption that the ma- 
terial is uniform and isotropic. These assumptions are clearly not true for layered 
and coated materials such as the components described above. Since it is generally 
difficult to obtain samples of the coating that are sufficiently large to use in the laser 
flash experiment, a method of obtaining the thermal properties of each layer within 
a layered sample would enable the properties of the coating to be determined. 

The experiment exposes one circular face of a cylindrical sample of material to a 
pulse of laser light, and measures the temperature rise of the centre of the opposite 
circular face. The sample is placed in a furnace so that measurements can be carried
out at well-controlled temperatures. The sample is put in the furnace before the
experiment starts and the experiment is not started until the furnace temperature 
and sample temperature are judged to be equal. The sample is supported by three 
small pins to minimise conductive losses, and the furnace is held in near-vacuum 
conditions to minimise convective losses. 

The laser flash is a pulse of laser light lasting less than 1 ms. The laser power 
is adjusted so that the maximum temperature rise in the sample caused by the laser 
flash is typically between 3 K and 4 K. This temperature rise gives a good signal-to- 
noise ratio on the detected signal, but is sufficiently small that radiative losses can 
be approximated well by linearisation and can be taken into account in a straightfor- 
ward manner when calculating the thermal diffusivity. The laser power used is not 
known by the user and cannot be obtained from the equipment. 

The temperature change of the rear face of the sample is measured throughout 
the experiment by an infra-red (IR) temperature sensor. The temperature sensor has 
a finite spot size and so the measurement is an average over an area rather than a 
value at a single point. 

In order to shield the temperature sensor from the laser flash, a guard cap with a 
window in it is placed over the end of the sample. The guard cap should not be in 
contact with the material sample since the conductive heat losses from the sample 
to the cap will affect the measured temperature and hence the calculated thermal 
diffusivity value. 

The measurement data set used in this work is shown in Fig. 8.1. This data set
is used as target data in the work reported here, meaning that the aim of the
optimisation work was to generate model results that fit these data well. The
measurement was carried out at a furnace temperature of 947.15 K, and Fig. 8.1 shows the
change in temperature relative to the furnace temperature. This data set was chosen
because i) the ambient temperature was sufficiently high that radiative losses would
be significant, and so determination of emissivity would be a possibility, and ii) the
same data set had been studied previously [1], giving values to which the calculated
results could be compared.

The measurement data set consists of temperature change measurements every
1.568 ms. The measurements are continued for 2.373952 s after the laser flash,
giving a total of 1515 measurements for time $t \ge 0$ (the first measurement being at
$t = 0$). It is assumed that the sample temperature has fully stabilised by the time that
the laser is fired, and so the temperature measured at $t = 0$ is taken to be the ambient
(furnace) temperature. The time axis is scaled such that the laser was fired at $t = 0$.
The small peak shortly after $t = 0$ is caused by energy from the laser flash that has
not been absorbed by the sample and is instead picked up by the temperature sensor.

The simplest form of analysis of these data [9] is based on an analytical
solution to the transient heat flow equation that assumes a uniform sample, an
instantaneous uniform laser flash, and no heat losses from the sample. This approach
leads to a 1-D model for the heat flow, and solution of this model gives an expression
of the form

\[
\Delta T(t) = \Delta T_\infty \left( 1 + 2 \sum_{n=1}^{\infty} (-1)^n \exp\left( -\frac{n^2 \pi^2 \alpha t}{L^2} \right) \right) \qquad (8.2)
\]
Fig. 8.1 Measured data set used for the work reported here



where $\Delta T(t)$ is the temperature rise of the rear face at time $t$, $\Delta T_\infty$ is the maximum
temperature rise, $\alpha$ is the thermal diffusivity, and $L$ is the thickness of the sample.
Defining $t_{1/2}$ as the time taken for the temperature rise to reach half of its maximum
value, which can be determined from the measured temperature values, gives

\[
\sum_{n=1}^{\infty} (-1)^n \exp\left( -\frac{n^2 \pi^2 \alpha t_{1/2}}{L^2} \right) + \frac{1}{4} = 0. \qquad (8.3)
\]

This is a nonlinear equation that gives $\alpha$ in terms of known values. Solving the
equation gives

\[
\alpha = 0.138785 \, \frac{L^2}{t_{1/2}}. \qquad (8.4)
\]
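The constant in (8.4) can be checked numerically. The sketch below is not part of the original analysis: it treats $x = \alpha t_{1/2} / L^2$ as the unknown in (8.3), truncates the series at 50 terms (an arbitrary choice), and uses plain bisection on a bracket chosen by inspection. The function names are illustrative only.

```python
import math

def half_rise_residual(x, n_terms=50):
    """Left-hand side of (8.3), with x = alpha * t_half / L**2."""
    series = sum((-1) ** n * math.exp(-((n * math.pi) ** 2) * x)
                 for n in range(1, n_terms + 1))
    return series + 0.25

def solve_half_rise_constant(lo=0.05, hi=0.5, tol=1e-12):
    """Bisection on a bracket over which the residual changes sign."""
    f_lo = half_rise_residual(lo)
    if f_lo * half_rise_residual(hi) >= 0:
        raise ValueError("bracket does not straddle the root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = half_rise_residual(mid)
        if f_lo * f_mid <= 0:
            hi = mid                  # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid     # root lies in [mid, hi]
    return 0.5 * (lo + hi)

constant = solve_half_rise_constant()
print(f"alpha * t_half / L^2 = {constant:.6f}")
```

Given a measured half-rise time and a sample thickness, the diffusivity under the loss-free 1-D assumptions would then follow as `constant * L**2 / t_half`.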

Subsequent work [2, 5, 7, 11] has developed corrections to the simple one-dimensional
model to allow for the finite duration of the laser pulse and for radiant heat losses
(including those from the curved faces). The methods of data processing that include
corrections still make a number of assumptions, including spatial uniformity
and isotropy of sample properties, insignificant temperature-dependency of material
properties during the experiment, spatial uniformity of the laser flash, and absence
of conductive and convective heat losses. The first of these assumptions clearly
does not hold for the layered samples of interest in this work. It will be shown in
Section 8.3 that the final assumption is not valid either.

For the purposes of the modelling work, the sample is assumed to be perfectly
cylindrical with plane parallel circular faces of radius 6 mm. The data shown in
Fig. 8.1 were gathered during the measurement of a layered sample. It is assumed
that the sample consists of two distinct layers. Each layer is assumed to be uniform
and isotropic. The known sizes and properties of the layers in the sample are listed
in Table 8.1. The emissivity, $\varepsilon$, is required for implementation of radiative
boundary conditions. A dash in a cell indicates that the property is unknown and is to
be determined using optimisation methods to minimise the difference between the
measured data and the model predictions. As has been mentioned above, the power
of the laser that generates the flash is unknown and also must be determined using
optimisation.



Table 8.1 Thicknesses and thermal properties of the layers. A dash in a cell indicates that the
property is unknown and is to be determined using optimisation

Material                                 P92 steel    Oxide
Thickness (mm)                           2.0942       0.2265
Density (kg m⁻³)                         7871         5015
Specific heat capacity (J kg⁻¹ K⁻¹)      1473.2       934.8
Thermal conductivity (W m⁻¹ K⁻¹)         45.181       -
Emissivity                               0.8          -



Previous analysis and simulation relating to this sample have been described in 
an NPL report [1]. All properties used in the work reported here have been taken
from that report or from the references therein. The full chemical composition of 
the steel is given in the earlier report. The oxide layer consists of two components, 
magnetite and iron/chromium spinel, but they have been treated as a single uniform 
substance in order to provide a simpler model for initial investigations. The model 
could easily be extended to account for more complex layered structures. 

8.3 Mathematical Model 



8.3.1 Governing Equations 

The model considers heat flow within the sample and assumes that the heat flow 
within the rest of the equipment is either irrelevant or can be taken into account 
via an appropriate choice of boundary conditions. Cylindrical polar coordinates
$\mathbf{r} = (r, \theta, z)$ and total temperature (rather than temperature change relative to furnace
temperature) will be used throughout.

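The temperature distribution within the sample obeys the transient heat flow
equation

\[
\rho(\mathbf{r}) \, c_p(\mathbf{r}) \, \frac{\partial T(\mathbf{r},t)}{\partial t} = \nabla \cdot \left( \lambda(\mathbf{r}) \nabla T(\mathbf{r},t) \right) + Q(\mathbf{r},t) \qquad (8.5)
\]

where $\mathbf{r}$ denotes a position within the sample, $T(\mathbf{r},t)$ is the temperature at a point
$\mathbf{r}$ and time $t$, $\rho$, $c_p$ and $\lambda$ are the density, specific heat capacity and thermal
conductivity, and $Q(\mathbf{r},t)$ is a heat source term that is used to account for the laser
flash.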
The domain is defined as $0 \le r \le R$, $0 \le \theta < 2\pi$, $0 \le z \le L$, where $R$ is the radius
of the sample and $L$ is its thickness. It is assumed that the problem is axisymmetric
so that variation with $\theta$ can be neglected. This reduces the model to two dimensions,
making it simpler and quicker to solve.

The domain is split into two layers in the z direction. The properties of each layer 
are isotropic and uniform and the two layers are assumed to have a perfect thermal 
bond. The layers are of thickness $z_1$ (P92) and $z_2$ (oxide), with $z_1 + z_2 = L$. Then the
material properties of the two layers are dependent only on z and are given by 
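
A piecewise-constant form consistent with this description is sketched here. The subscripts 1 (P92) and 2 (oxide) are assumed labels, and placing the P92 layer at the front face ($0 \le z \le z_1$) is an assumption, suggested by the later use of the steel $\Delta z$ in the source term. For each property $p \in \{\rho, c_p, \lambda\}$:

\[
p(z) =
\begin{cases}
p_1, & 0 \le z \le z_1, \\
p_2, & z_1 < z \le L.
\end{cases}
\]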

The source term $Q$ is assumed to affect a uniform layer (thickness $\Delta z$) at the front
of the sample directly. It is assumed that the flash is of equal intensity over its finite
duration. If the duration is $t_0$ and the intensity is $I$, then

A value for $t_0$ is known from the experiment and a value for $\Delta z$ will be assigned
when the numerical solution method is described. The value of $t_0$ used in this work
was 0.8 ms.
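
As a rough plausibility check on the scale of the source term, an adiabatic energy balance relates the flash intensity to the temperature rise. The sketch below is an illustration only: it assumes that all of the flash energy is absorbed, that there are no heat losses, and that the sample ends up at a uniform temperature. The target rise of 3.5 K is an illustrative value taken from the 3 K to 4 K range quoted earlier, and the function names are hypothetical.

```python
# Layer data from Table 8.1, converted to SI units.
layers = [
    {"name": "P92 steel", "thickness": 2.0942e-3, "density": 7871.0, "cp": 1473.2},
    {"name": "oxide", "thickness": 0.2265e-3, "density": 5015.0, "cp": 934.8},
]
t0 = 0.8e-3  # flash duration (s)

def areal_heat_capacity(layers):
    """Heat capacity per unit face area of the layered sample (J m^-2 K^-1)."""
    return sum(l["thickness"] * l["density"] * l["cp"] for l in layers)

def intensity_for_rise(delta_T, layers, t0):
    """Flash intensity (W m^-2) giving a uniform adiabatic rise delta_T (K)."""
    return delta_T * areal_heat_capacity(layers) / t0

I_est = intensity_for_rise(3.5, layers, t0)  # 3.5 K is an illustrative target
print(f"estimated intensity: {I_est:.3e} W m^-2")
```

Because losses are ignored, this only indicates the order of magnitude of the intensity that the optimisation has to recover.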

The initial conditions assume that the sample is uniformly at the ambient
temperature $T_0$ at time 0, so that

\[
T(r,z,0) = T_0, \quad 0 \le r \le R, \quad 0 \le z \le L. \qquad (8.11)
\]



8.3.2 Boundary Conditions 

The simplest modelling assumption for boundary conditions is that all cooling is
due to radiation only. A straight-line fit to the cooling section of the curve shown
in Fig. 8.1 gives a cooling rate of approximately 0.42 K s⁻¹. A simplified model
assuming instantaneous uniform temperature change throughout the sample during
cooling shows that the maximum radiative cooling rate for a sample of this type at
an ambient temperature of 947.15 K and a sample temperature of 952.15 K is
approximately 0.02 K s⁻¹ (this value is likely to be an overestimate due to the model
assumptions). The difference of an order of magnitude in cooling rates suggests that
the sample must be losing heat via some mechanism in addition to radiation.
Considering the experimental set-up, the most likely source of extra heat loss
from the sample is contact between the guard cap and the sample. This contact 
could be caused by thermal expansion of the sample, the sample being too large for 
the holder, or poor positioning of the sample within the holder. 

The cross section of the guard cap is shown in Fig. 8.2, including dimensions. The
cap has a window at its centre through which the sample temperature is measured 
and which is transparent to infrared radiation. The radiative losses of the sample 
pass through this window. It is assumed that there is a perfect thermal bond between 
the surface of the guard cap marked with a heavy line and the equivalent portion 
of the sample, and that the guard cap is uniformly at the ambient temperature. It 
is assumed that the window is not in contact with the sample since in reality it is 
slightly offset from the main part of the guard cap. These assumptions avoid the 
need to include the heat flow within the guard cap in the model and enable the 
conductive heat losses to be modelled as a boundary condition.

[Figure: cross section of the guard cap, labelled with the guard cap, the window, and the axis of rotational symmetry; dimensions marked are 15 mm, 7.2 mm, 4.3 mm, 1.7 mm, 1.5 mm and 1.7 mm]

Fig. 8.2 Sketch of the guard cap geometry (not to scale). The dark area marks the region of
contact between guard cap and sample



The curved surfaces of the sample are assumed to be perfectly insulated. The 
temperature gradient along the axis of symmetry must be zero for axisymmetry to 
be valid. The flat face exposed to the laser is assumed to lose heat radiatively. These 
boundary conditions can be expressed as 



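\[
\left. \frac{\partial T}{\partial r} \right|_{r=0} = 0, \quad 0 \le z \le L, \qquad (8.12)
\]
\[
\left. \frac{\partial T}{\partial r} \right|_{r=R} = 0, \quad 0 \le z \le L, \qquad (8.13)
\]
\[
\lambda \left. \frac{\partial T}{\partial z} \right|_{z=0} = \varepsilon \sigma \left( T(r,0,t)^4 - T_0^4 \right), \quad 0 \le r \le R, \qquad (8.14)
\]
\[
\lambda \left. \frac{\partial T}{\partial z} \right|_{z=L} = -\varepsilon \sigma \left( T(r,L,t)^4 - T_0^4 \right), \quad 0 \le r \le r_w, \qquad (8.15)
\]
\[
T(r,L,t) = T_0, \quad r_w < r \le R, \qquad (8.16)
\]

where $r_w$ is the radius of the window in the guard cap and $\sigma$ is the Stefan-Boltzmann
constant.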



8.3.3 Numerical Methods 

These equations, boundary conditions, initial conditions, and material properties 
fully define the two-dimensional model. The model cannot be solved analytically. 
A numerical approximation technique must be used instead. 

The technique used to solve the model numerically is based on the finite volume 
method. The structure and approach are described in detail elsewhere |IT]|6|. The 
work reported here has used a version of TherMol fT/SI, an NPL software package 
for multiphysics applications focussing on the diffusion equation, as the basis for 
the model. The software used has been adapted from a three-dimensional imple- 
mentation of TherMol. 

The finite volume mesh uses two different volume sizes $\Delta z$ in the $z$ direction,
one for each material. The oxide layer had $\Delta z = 0.0453$ mm and the P92 steel had
$\Delta z = 0.0419$ mm. The latter value was used as the value of $\Delta z$ in the definition of
$Q(r,t)$ in (8.10). A uniform volume size of $\Delta r = 0.1$ mm was used in the $r$ direction.

The finite volume model calculates the change in temperature relative to the ini- 
tial temperature. This approach means that the rounding errors caused by the use 
of finite-precision arithmetic have little effect. The only feature of the model that 
requires the use of the true temperature is the calculation of radiative losses, and the 
appropriate formulation is used in that section of the software.

An explicit time integration method has been used for the transient calculations 
for simplicity and ease of implementation. The time step was chosen by trial and 
error for a typical set of parameter values, and was then divided by 10 to ensure 
that the model would run for more extreme parameter value choices. No numerical 
stability problems have been encountered during the work. 
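
For orientation, the textbook stability bound for an explicit forward-time, centred-space diffusion step gives an idea of the time step sizes involved. The sketch below is an assumption-laden guide and not the procedure used in the chapter, where the step was chosen by trial and error: it applies the bound for a uniform two-dimensional Cartesian grid to the P92 steel properties from Table 8.1 and the mesh sizes quoted above, and ignores the axisymmetric geometry and the nonlinear radiative boundary conditions. The function name is hypothetical.

```python
def ftcs_time_step_limit(conductivity, density, cp, dz, dr):
    """Upper bound on dt for an explicit diffusion step on a uniform 2-D grid.

    Uses dt <= 1 / (2 * alpha * (1/dz**2 + 1/dr**2)), alpha = lambda / (rho * cp).
    """
    alpha = conductivity / (density * cp)  # thermal diffusivity (m^2 s^-1)
    return 1.0 / (2.0 * alpha * (1.0 / dz**2 + 1.0 / dr**2))

# P92 steel values from Table 8.1; mesh sizes from the text (converted to metres).
dt_steel = ftcs_time_step_limit(45.181, 7871.0, 1473.2, dz=0.0419e-3, dr=0.1e-3)
print(f"indicative explicit time step limit in the steel: {dt_steel:.2e} s")
```

A time step a factor of 10 below such a bound, as in the safety margin described above, leaves room for parameter choices that raise the diffusivity.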

The results of interest from the calculations were the temperature changes of the 
rear face averaged over the spot size of the temperature sensor. It was assumed that 
the temperature sensor spot size was the same size as the window of the guard cap. 
The results were output at the same time intervals as the measurements to enable 
direct comparison. 
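
The averaging over the sensor spot can be sketched as an area-weighted mean over the annular finite volumes on the rear face. The code below is a minimal illustration, assuming a uniform radial mesh and a spot radius that is a whole number of cells; the function name and the numbers in the example are hypothetical and are not taken from TherMOL.

```python
def spot_average(cell_values, dr, spot_radius):
    """Area-weighted mean over annular cells lying inside the sensor spot.

    cell_values[i] is the value in the cell spanning i*dr <= r < (i+1)*dr.
    """
    n_cells = min(len(cell_values), int(round(spot_radius / dr)))
    if n_cells < 1:
        raise ValueError("spot must cover at least one cell")
    # Annulus area is pi*dr^2*((i+1)^2 - i^2) = pi*dr^2*(2i+1); pi*dr^2 cancels.
    weights = [2 * i + 1 for i in range(n_cells)]
    weighted_sum = sum(w * v for w, v in zip(weights, cell_values))
    return weighted_sum / sum(weights)

# Hypothetical rear-face values on three radial cells of width 0.1 mm,
# averaged over a hypothetical spot of radius 0.2 mm (the first two cells).
example = spot_average([1.0, 2.0, 5.0], dr=0.1e-3, spot_radius=0.2e-3)
print(example)
```

The outer cells carry more weight than the inner ones because their annular area is larger, which matters when the rear-face temperature varies with radius near the guard cap.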

The software TherMOL has been used before with an optimisation routine to
determine unknown properties of the laser flash experiment [1]. That work
used a one-dimensional model with radiative cooling only and an extension of the
Nelder-Mead algorithm, COBYLA [10], able to handle constraints on the parameter
values. When applied to the data shown in Fig. 8.1, the optimisation process gave
a thermal conductivity of 2.1 W m⁻¹ K⁻¹ for the oxide layer, and a laser power
intensity of 1.6 × 10⁸ W m⁻², but the model results did not fit the cooling part of
the temperature curve at all well.
8.4 Optimisation Results

The initial optimisation aimed to minimise the relative difference between the
measured temperatures and the model predictions by varying the thermal conductivity
of the oxide layer, $\lambda_2$, the laser power, $I$, and the emissivity of the oxide layer, $\varepsilon_2$.
For any optimisation problem, the formulation of the right objective function is
important. Here we intend to minimise errors, but errors can be defined as either
relative or absolute, which implies two different ways of defining the objective
function.



8.4.1 Initial Optimisation Results 

The objective function was initially defined as the root mean square average of the
relative differences between the measured data and the calculated values:

\[
\sqrt{ \frac{1}{N} \sum_{n=1}^{N} \left( \frac{T_n - \hat{T}(t_n; \lambda_2, I, \varepsilon_2)}{T_n} \right)^2 } \qquad (8.17)
\]

where $N = 1515$ is the number of data points, $T_n$ is the measured temperature rise
(i.e. $T - T_0$) at time $t_n$, and $\hat{T}(t_n; \lambda_2, I, \varepsilon_2)$ is the calculated surface temperature
rise averaged over the spot size of the temperature sensor for a given set of values of the
model parameters $\lambda_2$, $I$, $\varepsilon_2$.

Two optimisation algorithms were used: the Levenberg-Marquardt algorithm
within a trust-region fS"?! and a particle swarm optimisation algorithm (PSO) El. 
The Levenberg-Marquardt algorithm is an efficient local optimiser and will find
global minima for smooth unimodal surfaces. The work reported here uses the
Matlab Optimisation Toolbox function lsqnonlin, an implementation of the
Levenberg-Marquardt algorithm designed for minimisation of least-squares
functions such as (8.17). The PSO implementation used was developed at NPL. The
technique is a global optimiser and can be used to check that the local optimiser's 
results are globally optimal. The PSO is expected to take more time to converge to 
the optimal solution than the local optimiser. 

The initial optimisation results gave a poor fit to the measurement data,
particularly at later times. A typical set of results, generated by parameter values identified
as optimal by both algorithms, is shown in Fig. 8.3. The lower of the plots shows a
close-up of the first 0.15 seconds. The measured temperature changes here are very
close to zero, and the model is a very good fit to these values. The small values of
$T_n$ for these initial times mean that the relative errors are very large, and so the fit of
the model to the data at small times dominates the overall fit.

This dominance of small measured temperature change values suggests that the
root mean square of differences,

\[
\sqrt{ \frac{1}{N} \sum_{n=1}^{N} \left( T_n - \hat{T}(t_n; \lambda_2, I, \varepsilon_2) \right)^2 }, \qquad (8.18)
\]

would be a better choice of objective function. This function has been used to
generate all results in the rest of this chapter. Normally this sum would be weighted
according to the uncertainties associated with the measurements, but here it is
assumed that the uncertainties were the same for all measurements.

Fig. 8.3 An inappropriate objective function leads to a poor fit of model results to measured
data (upper figure), but an unnecessarily good fit to values around zero (lower figure)

In general, if the target data contains values close to zero, absolute differences 
may be a better choice of objective function than relative differences. Whilst min- 
imisation of relative errors can be a good way to combine results of different types, 
the measured data may not be the best choice of scaling factor when measured values 
are close to zero. The problems experienced illustrate the importance of considering 
the construction of the objective function carefully. 
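
The effect of near-zero target values on the two objective functions can be seen with a toy data set. The sketch below uses made-up numbers, not the measured data: one candidate model matches two near-zero early readings exactly but is 20% low later, and the other is slightly off at the early readings but exact later. The function names are illustrative.

```python
import math

def rms_relative(measured, modelled):
    """Root mean square of relative differences, in the sense of (8.17)."""
    terms = [((m - p) / m) ** 2 for m, p in zip(measured, modelled)]
    return math.sqrt(sum(terms) / len(terms))

def rms_absolute(measured, modelled):
    """Root mean square of absolute differences, in the sense of (8.18)."""
    terms = [(m - p) ** 2 for m, p in zip(measured, modelled)]
    return math.sqrt(sum(terms) / len(terms))

# Hypothetical temperature rises (K): two early near-zero values, three later values.
measured = [0.001, 0.002, 2.0, 3.0, 3.5]
model_a = [0.001, 0.002, 1.6, 2.4, 2.8]  # exact early, 20% low later
model_b = [0.003, 0.004, 2.0, 3.0, 3.5]  # 0.002 K off early, exact later

for name, fn in (("relative", rms_relative), ("absolute", rms_absolute)):
    print(name, fn(measured, model_a), fn(measured, model_b))
```

With these numbers the relative measure prefers the first model and the absolute measure prefers the second, which mirrors the behaviour described for Fig. 8.3.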

In order to check the sensitivity of the converged solution to the initial parameter 
estimates, five sets of optimisation runs were carried out using each optimisation 
algorithm. The runs started from randomly-generated points within the parameter 
search space. The optimal solutions found and the number of function evaluations 
required to find them are summarized in Table 8.2.

The results in Table 8.2 show that both algorithms converge to the same
optimal solution repeatedly. The PSO showed more variation within the five runs than
lsqnonlin, leading to a higher standard deviation for each of the parameters. The
size of the standard deviation is strongly linked to the stopping criteria of the
optimisation algorithms. The deviations listed below are consistent with the algorithms
effectively arriving at the same solution. As expected, PSO required more function
evaluations (about 327) to converge than the efficient local optimiser (about 32). As
stated in Section 8.3, earlier work had found values of $\lambda_2 = 2.1$ W m⁻¹ K⁻¹ and
$I = 1.6 \times 10^8$ W m⁻², so the value of $\lambda_2$ found here differs from the earlier one by about 25%.



Table 8.2 Summary of optimisation results. Means and standard deviations of parameters
calculated from 5 runs. Note that intervals specified here are ± one standard deviation

                      PSO                      lsqnonlin
Evaluations           327 ± 56                 32 ± 3
λ₂ (W m⁻¹ K⁻¹)        2.83 ± 0.23              2.82 ± 0.005
I (W m⁻²)             (1.69 ± 0.12) × 10⁸      (1.69 ± 0.04) × 10⁸
ε₂                    0.092 ± 0.073            0.0000 ± 0.002



The covariance matrix, $V_a$, associated with these parameter estimates has been
calculated from the goodness of fit and Jacobian matrix, using the equation

\[
V_a = \left( J^{\mathsf{T}} J \right)^{-1} \frac{1}{N - m} \sum_{n=1}^{N} \left( T_n - \hat{T}(t_n; \lambda_2, I, \varepsilon_2) \right)^2 \qquad (8.19)
\]

where $J$ is the Jacobian matrix and $m = 3$ since there are three parameters. The
Jacobian matrix has been estimated using finite difference approximations since the
objective function of this model is a black box. The standard uncertainties
associated with the parameter estimates are given by the square root of the diagonal
entries of the resulting matrix. The standard uncertainties were, in the order $\lambda_2$, $I$,
$\varepsilon_2$, 0.047 W m⁻¹ K⁻¹, 7.2 x 10^ W m⁻², and 0.17. This reflects the high degree of
uncertainty about the emissivity. The associated estimate of the goodness of fit was
8.3 × 10⁻³ K.
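
The covariance calculation in (8.19) can be illustrated on a problem small enough to check by hand. The sketch below is a toy example, not the chapter's model: it fits a straight line with two parameters to four made-up points, estimates the Jacobian by forward finite differences (treating the model as a black box), and inverts the 2 × 2 matrix $J^{\mathsf{T}} J$ explicitly. All names and data are hypothetical.

```python
import math

def model(x, params):
    """Toy black-box model: a straight line a + b*x."""
    a, b = params
    return a + b * x

def jacobian_fd(xs, params, rel_step=1e-6):
    """Forward finite-difference Jacobian of the model outputs w.r.t. params."""
    rows = []
    for x in xs:
        base = model(x, params)
        row = []
        for j in range(len(params)):
            h = rel_step * max(1.0, abs(params[j]))
            shifted = list(params)
            shifted[j] += h
            row.append((model(x, shifted) - base) / h)
        rows.append(row)
    return rows

def covariance_two_params(xs, ys, params):
    """V = (J^T J)^-1 * sum(residual^2) / (N - m), written out for m = 2."""
    J = jacobian_fd(xs, params)
    n, m = len(xs), len(params)
    s2 = sum((y - model(x, params)) ** 2 for x, y in zip(xs, ys)) / (n - m)
    a11 = sum(r[0] * r[0] for r in J)
    a12 = sum(r[0] * r[1] for r in J)
    a22 = sum(r[1] * r[1] for r in J)
    det = a11 * a22 - a12 * a12
    inverse = [[a22 / det, -a12 / det], [-a12 / det, a11 / det]]
    return [[s2 * v for v in row] for row in inverse]

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 2.1, 2.9]
best = [0.06, 0.96]  # least-squares intercept and slope for these points
V = covariance_two_params(xs, ys, best)
std_uncertainties = [math.sqrt(V[i][i]) for i in range(2)]
print(std_uncertainties)
```

For the three-parameter problem in the text the same steps apply, with a 3 × 3 inverse in place of the explicit 2 × 2 formula.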

The results of the model obtained by using the optimal parameter values are
shown in Fig. 8.4. These model results are clearly a better fit to the measured
data than those shown in Fig. 8.3, illustrating the benefit of changing the objective
function. The results are also a better fit to the measured values than the results
obtained during the work described in [1], illustrating that the new model simulates the
experiment better than the original version.
Fig. 8.4 Model results calculated using the best solutions found ($\lambda_2 = 2.8229$, $I = 1.6903 \times 10^8$
and $\varepsilon_2 = 0.0$)



There is still a discrepancy in the cooling part of the curve: the model
results appear to cool too fast. In addition, the value of $\varepsilon_2$ that has been found is
unexpectedly low (physically the value must be between 0 and 1, and was
expected to be close to 0.8). These observations suggest that the conductive losses
through the guard cap over-estimate the true cooling, and that the conductive
losses dominate the radiative losses to the point where emissivity cannot be
determined.
8.4.2 Revision of the Model

The results of the initial model suggest that whilst the inclusion of the effects of the 
guard cap improves the fit to the measured data, the model is still not predicting the 
cooling part of the measured data correctly. In order to improve the fitting of the 
cooling part of the data, the perfect bond between the guard cap and the sample is 
changed to an imperfect thermal bond parameterised by the unknown thermal bond 
quality parameter $\beta$ with units W m⁻² K⁻¹.

The imperfect bond is defined as an extra layer between the guard cap and the 
surface of the sample. The bond quality parameter is effectively the thermal conduc- 
tivity of the extra layer divided by its thickness. The boundary conditions describing 
the imperfect bond are generated by solving a one-dimensional steady state heat flux 
equation analytically at each point on the sample surface that is in contact with the 
guard cap, and imposing continuity of temperature and flux at the boundaries of the 
extra layer. This approach assumes that heat flow within the extra layer
occurs only in the $z$ direction, which is valid because the extra layer is not real and
is only a simulation of a poor thermal bond.

The imperfect thermal bond boundary condition is implemented as 



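\[
\lambda \left. \frac{\partial T(r,z,t)}{\partial z} \right|_{z=L} = \frac{\beta \, \lambda / (\Delta z / 2)}{\beta + \lambda / (\Delta z / 2)} \left( T_0 - T(r, L - \Delta z / 2, t) \right)
\]

where $r_w < r \le R$. From this implementation it is clear that $\beta = 0$ is a perfectly
insulating boundary, and that as $\beta \to \infty$ the condition tends towards a perfect thermal
bond with $T(r,L,t) = T_0$ as in the initial model.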
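
The two limits can be checked with a short calculation. The sketch below assumes that the boundary condition amounts to two thermal conductances in series, the bond ($\beta$) and the half cell next to the surface ($\lambda / (\Delta z / 2)$), which is one reading of the steady-state derivation described above rather than a statement of the TherMOL implementation. The function name is hypothetical; the numerical inputs are the oxide mesh size and the values of $\lambda_2$ and $\beta$ reported in Section 8.4.3.

```python
def bond_conductance(beta, conductivity, dz):
    """Series combination of the bond and half-cell conductances (W m^-2 K^-1)."""
    half_cell = conductivity / (0.5 * dz)
    if beta == float("inf"):
        return half_cell  # perfect bond: only the half cell resists the flow
    return beta * half_cell / (beta + half_cell)

dz_oxide = 0.0453e-3  # oxide mesh size from the text (m)
half_cell = 3.55 / (0.5 * dz_oxide)

insulated = bond_conductance(0.0, 3.55, dz_oxide)
perfect = bond_conductance(float("inf"), 3.55, dz_oxide)
fitted = bond_conductance(1.92e4, 3.55, dz_oxide)
print(insulated, perfect, fitted)
```

The combined conductance is always smaller than either of its two components, so a finite $\beta$ can only slow the conductive loss to the guard cap relative to the perfect-bond model.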

8.4.3 Optimisation Results with the New Model 

Following the successful application of the Levenberg-Marquardt algorithm to the
initial version of the model, the optimisations using the new model with the
imperfect bond are carried out using lsqnonlin only. The new model uses four
parameters and so it is expected that more function evaluations will be required to
find a converged solution. The optimisation identified the best parameter values as

• $\lambda_2 = 3.55$ W m⁻¹ K⁻¹,

• $I = 1.67 \times 10^8$ W m⁻²,

• $\varepsilon_2 = 1.0$,

• $\beta = 1.92 \times 10^4$ W m⁻² K⁻¹.

These values were obtained from five runs started from randomly-generated initial
parameter estimates. The average number of function evaluations required for
convergence was 126. The standard deviations of each of the parameter values across
the five runs were less than 10^^, suggesting good repeatability. The covariance
matrix was calculated using equation (8.19). This calculation was complicated by the
Jacobian estimates suggesting that the derivative of the objective function with
respect to $\varepsilon_2$ at $\varepsilon_2 = 1$ was zero. This zero sensitivity meant that the emissivity could
not be included in the covariance calculation, so the covariance matrix only
considered $\lambda_2$, $I$, and $\beta$. The standard uncertainties found from the matrix were, in the
order $\lambda_2$, $I$, and $\beta$, 0.045 W m⁻¹ K⁻¹, 1.9 x 10^ W m⁻², and 620 W m⁻² K⁻¹. The
associated estimate of the goodness of fit was 5.2 x 10^^ K, an improvement on the
previous model.

The model results obtained by using these parameter values are plotted in Fig.
8.5, along with the measured data and the model results shown in Fig. 8.4 (dashed
line). These plots show that the use of an imperfect thermal bond improves the fit
of the model results to the measured data, particularly for $t > 0.5$. The old and new
models are in close agreement for $t < 0.3$, which is the time period where the energy
absorbed by the sample from the laser flash is likely to dominate the heat flow and
differences in the value of $\lambda_2$ are less likely to have an effect as the oxide layer is
comparatively thin.
Fig. 8.5 Comparison of new model and original model and their best-fit curves



8.5 Discussion 

Using the revised model, the value of $\lambda_2$ has increased by about 20 %, and that of
$\varepsilon_2$ has gone from 0 to 1, whereas the value of $I$ has not changed significantly. The
parameter $I$ defines how much energy goes into the sample during the laser flash,
and the good fit of the model predictions to the peak temperature rise suggests that
this value has been determined accurately. The value of $\lambda_2$ affects how the heat
flows within the sample, so it is expected that a change in the boundary conditions
would affect the optimal value of $\lambda_2$. Whilst the new value of $\varepsilon_2$ is closer to the
expected value of 0.8, it is by no means certain that this value is a good estimate of
the true value. It is likely that the cooling due to the contact with the guard cap still
dominates the heat loss, making $\varepsilon_2$ difficult to determine.

It is worth pointing out that the fit, though improved, is still not perfect. Ideally 
the differences between the model predictions and measured values would lie be- 
low the level of the measurement noise, but this level of agreement has not been 
achieved. There are differences between the model results and the measured values 
for 0.3 < t < 0.5 which suggest that further improvements could be made to the 
model. Possibilities for improvements include adding circumferential heat losses, 
considering an imperfect thermal bond between the sample and the oxide, adding 
extra layers to allow for the multi-phase nature of the oxide, and including a full 
model of the guard cap so that the conductive losses are modelled more accurately. 

It is clear from the results shown here that choosing the best optimisation al- 
gorithm can significantly reduce the number of function evaluations, leading to a 
reduction in computational time. There was no difference between the algorithms 
in terms of the accuracy of the estimates. The results suggest that the accuracy of 
the parameter estimates is constrained by the quality of the underpinning model and 
(were the model to be improved) the uncertainties associated with the measurement 
data. 

Recent trends suggest that metaheuristic algorithms such as PSO are increasingly
widely used [8, 12], but popularity does not mean the algorithm is the best choice.
In this case study, both PSO and nonlinear least squares provided very good results,
but the classical, well-tested nonlinear least squares required significantly fewer
function evaluations to reach a converged solution. The objective function in this case
study was formulated in the least-squares sense, and the results suggest that a unique
global optimum exists, which makes the least-squares optimiser more suitable than
a metaheuristic algorithm.

Experience gained in this case study and in other applications suggests that it is a 
good idea to use well-established algorithms when first solving a new optimisation 
problem. If the well-established algorithms fail, it is worth trying metaheuristic
algorithms [12]. This approach can avoid unnecessary and time-consuming trial and
error.
Chapter 9

Applications of Computational Intelligence in 

Behavior Simulation of Concrete Materials 

Amir Hossein Gandomi and Amir Hossein Alavi 



Abstract. The application of Computational Intelligence (CI) to structural engi- 
neering design problems is relatively new. This chapter presents the use of the CI 
techniques, and specifically Genetic Programming (GP) and Artificial Neural 
Network (ANN) techniques, in behavior modeling of concrete materials. We first 
introduce two main branches of GP, namely Tree-based Genetic Programming 
(TGP) and Linear Genetic Programming (LGP), and two variants of ANNs, called 
Multi Layer Perceptron (MLP) and Radial Basis Function (RBF). The simulation 
capabilities of these techniques are further demonstrated by applying them to two 
conventional concrete material cases. The first case is simulation of concrete com- 
pressive strength using mix properties and the second problem is prediction of 
elastic modulus of concrete using its compressive strength. 



9.1 Introduction 

Modeling of structural engineering nonlinear systems is a diverse research area 
where different kinds of methods can be utilized. Due to the large variety of 
this field, no method can impose itself as the best solution. Estimating both the 
structure and the parameters of the structural engineering problems makes their 
modeling process a difficult task. Different criteria for model classification can 
be characterized while dealing with a system modeling task [1]. A model can be 
classified as phenomenological or behavioral [2]. A phenomenological model is
derived by taking into account the physical relations governing the system. As a 
result, the structure of the model is selected according to the prior knowledge 
about the system. It is not always possible to design phenomenological models for 
many of the structural engineering systems due to their complexity. In order to 
overcome such a problem, the behavioral models are commonly employed. Such
models approximate the relationships between the inputs and outputs based on a
measured set of data without the need for prior knowledge about the mechanism that
produced the experimental data. The behavioral models can provide very good
results with a minimal effort [2]. Traditional statistical regression techniques are
commonly used for behavioral modeling. However, regression analysis can
have large uncertainties. It has major drawbacks for idealization of complex
processes, approximation, and averaging widely varying prototype conditions. A
further limitation is that regression analysis
tries to model the nature of the corresponding problem by a pre-defined linear or
nonlinear equation. Another major constraint in the application of regression
analysis is the assumption of normality of residuals.

In the case of the behavioral models, several alternative computer-aided pattern 
recognition and data classification approaches have been developed. Computa- 
tional intelligence (CI) [3] techniques are well-known pattern recognition meth- 
ods. Developments in the computer hardware during the last two decades have 
made it much easier for these techniques to grow into more efficient frameworks. 
In addition, various CI-based approaches may be used as efficient tools in
problems where conventional approaches fail or perform poorly. Artificial neural
networks (ANNs) are the most widely-used CI methods. ANNs have been used for a
wide range of structural engineering problems (e.g. [4]). In spite of the successful
performance of ANNs, they usually do not give a deep insight into the process
by which they use the available information to obtain a solution. In the present study,
the approximation ability of two of the most widely used ANN architectures,
namely Multi Layer Perceptron (MLP) and Radial Basis Function (RBF), is
investigated.

Genetic algorithm (GA) is a powerful stochastic search and optimization me- 
thod based on the principles of genetics and natural selection. GA has been shown 
to be suitably robust for a wide variety of complex civil engineering problems 
(e.g. [5]). Genetic programming (GP) [6] is an alternative approach for behavior 
modeling of geotechnical engineering tasks. GP is a developing subarea of
evolutionary algorithms inspired by Darwin's theory of evolution. It may generally
be defined as a specialization of GA where the solutions are computer programs 
rather than fixed-length binary strings. The programs generated by traditional GP 
are represented as tree structures and expressed in the functional programming 
language [6]. This classical GP approach is referred to as Tree-Based GP (TGP). 
For the last ten years, traditional GP has been pronounced as an alternative method 
for simulating the behavior of civil engineering problems (e.g. [7]). Linear genetic 
programming (LGP) [8] is a new subset of GP with a linear structure similar to the 
DNA molecule in biological genomes. More specifically, LGP operates on pro- 
grams that are represented as linear sequences of instructions of an imperative 
programming language [8, 9]. In contrast with traditional GP and ANNs, the
application of LGP in the field of civil engineering is new and original (e.g., [10,
11]).

This chapter examines the feasibility of using TGP, LGP, MLP, and RBF for
behavior simulation of concrete materials. To verify the capabilities of these
techniques, they are applied to the modeling of the compressive strength and
elastic modulus of concrete. The chapter is organized as follows: Section 9.2
presents the basic aspects and characteristics of the employed algorithms. The
modeling process and parameter settings for the methods are given in Section 9.3.
Numerical examples and the obtained results are presented in Section 9.4.
Section 9.5 presents a discussion of the capabilities of the methods. Finally, some
concluding remarks are provided in Section 9.6.



9.2 Computational Intelligence 

Evolutionary algorithms (EAs) [12] are a subset of evolutionary computation.
They use biology-inspired mechanisms to optimize a solution with regard to a
desired result. Computational intelligence (CI) [3] includes EAs and all of their
different branches, together with artificial neural networks and fuzzy logic. The
CI techniques have wide-ranging applications in approximating nonlinear
relationships. A survey of the literature reveals the growing interest of the
research community in the relatively new field of computational intelligence. In
the following subsections, the branches of CI employed in this research are briefly
introduced.



9.2.1 Genetic Programming 

GP is a symbolic optimization technique that creates computer programs to solve a
problem using the principle of Darwinian natural selection [6]. The breakthrough
in GP came in the late 1980s with experiments on symbolic regression. GP was
introduced by Koza [6] as an extension of GA. Most of the genetic operators used in
GA can also be implemented in GP with minor changes. The main difference between GP
and GA is the representation of the solution. GA creates a string of numbers that
represents the solution. The GP solutions are computer programs represented as tree
structures and expressed in a functional programming language (like LISP) [6, 10,
20]. In other words, in GP, the evolving programs (individuals) are parse trees
that can vary in length throughout the run, rather than fixed-length binary
strings. Essentially, this is the beginning of computer programs that program
themselves [6]. Since GP often evolves computer programs, the solutions can be
executed without post-processing, while the coded binary strings typically evolved
by GA require post-processing. Traditional optimization techniques, like GA, are
generally used in parameter optimization to evolve the best values for a given set
of model parameters. GP, on the other hand, gives the basic structure of the
approximation model together with the values of its parameters [13]. GP optimizes a
population of computer programs according to a fitness landscape determined by a
program's ability to perform a given computational task. The fitness of each
program in the population is evaluated using a fitness function. Thus, the fitness
function is the objective function that GP aims to optimize [14].

The classical GP technique is also referred to as tree-based GP (TGP) [6]. In
TGP, a random population of individuals (computer programs) is created to achieve
high diversity. A population member in TGP is a hierarchically structured tree
comprising functions and terminals. The functions and terminals are selected from a
set of functions and a set of terminals. For example, the function set F can
contain the basic arithmetic operations (+, -, ×, /, etc.), Boolean logic functions
(AND, OR, NOT, etc.), or any other mathematical functions [10, 20]. The terminal
set T contains the arguments for the functions and can consist of numerical
constants, logical constants, variables, etc. The functions and terminals are
chosen at random and combined to form a computer model in a tree-like structure,
with a root point and branches extending from each function and ending in a
terminal. An example of a simple tree representation of a TGP model is illustrated
in Fig. 9.1.
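To make the tree representation concrete, the following minimal sketch stores a program as nested tuples and evaluates it recursively. The encoding, the names `FUNCTIONS`, `protected_div` and `evaluate`, and the use of protected division are illustrative assumptions and are not taken from the tools used in this chapter.

```python
import operator


def protected_div(a, b):
    """Division that returns 1.0 for a zero denominator (an assumed convention)."""
    return a / b if b != 0 else 1.0


# Function set F: symbol -> (callable, number of arguments)
FUNCTIONS = {
    "+": (operator.add, 2),
    "-": (operator.sub, 2),
    "*": (operator.mul, 2),
    "/": (protected_div, 2),
}


def evaluate(node, variables):
    """Recursively evaluate a tree for a dict of variable values."""
    if isinstance(node, tuple):              # function node: (symbol, child, ...)
        symbol, *children = node
        func, arity = FUNCTIONS[symbol]
        assert len(children) == arity
        return func(*(evaluate(child, variables) for child in children))
    if isinstance(node, str):                # terminal node: variable
        return variables[node]
    return node                              # terminal node: numeric constant


# Tree for (x1 + 3 / x2) * (x1 + 3 / x2)
inner = ("+", "x1", ("/", 3.0, "x2"))
tree = ("*", inner, inner)
result = evaluate(tree, {"x1": 1.0, "x2": 2.0})
print(result)
```

Because every function node returns a number and every terminal is a number, any sub-tree is itself a valid program, which is what allows the genetic operators to exchange sub-trees freely.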



Fig. 9.1 The tree representation of a TGP model $(X_1 + 3/X_2)^2$, showing functional nodes and terminal nodes (after [10])

Once a population of individuals (models) has been created at random, the TGP
algorithm evaluates the fitness of the individuals, selects individuals for
reproduction, and generates new individuals by reproduction, crossover and
mutation [6]. The reproduction operation gives a higher probability of selection to
more successful individuals, which are copied into the next generation without any
change. The crossover operation ensures the exchange of genetic material between
the evolved programs. During the crossover procedure, a point on a branch of each
solution (program) is selected at random, and the sets of terminals and/or
functions below these points are swapped to create two new programs. Fig. 9.2
shows a typical crossover operation on two computer programs consisting of several
function and terminal genes. Two new child computer programs (Child I, Child II)
are generated from two parental computer programs (Parent I, Parent II). In
Fig. 9.2, the randomly generated crossover points are shown by dotted lines. It can
be seen that both child organisms include genetic material from their parents. The
syntactic structure of the programs must be preserved during the crossover
process.
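The crossover operation described above can be sketched as follows, assuming programs are stored as nested tuples of the form `(symbol, child, ...)`. The helper names (`subtree_paths`, `get_subtree`, `replace_subtree`, `crossover`, `size`) and the uniform choice of crossover points are assumptions made for illustration.

```python
import random


def subtree_paths(tree, path=()):
    """Return the path (tuple of child indices) of every node in the tree."""
    paths = [path]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            paths.extend(subtree_paths(child, path + (i,)))
    return paths


def get_subtree(tree, path):
    """Follow a path of child indices down to a sub-tree."""
    for i in path:
        tree = tree[i]
    return tree


def replace_subtree(tree, path, new):
    """Return a copy of the tree with the sub-tree at `path` replaced."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]


def crossover(parent1, parent2, rng):
    """Swap one randomly chosen sub-tree between the two parents."""
    p1 = rng.choice(subtree_paths(parent1))
    p2 = rng.choice(subtree_paths(parent2))
    s1, s2 = get_subtree(parent1, p1), get_subtree(parent2, p2)
    return replace_subtree(parent1, p1, s2), replace_subtree(parent2, p2, s1)


def size(tree):
    """Number of nodes in the tree."""
    return len(subtree_paths(tree))


rng = random.Random(0)
parent1 = ("+", "x1", ("*", "x2", 3.0))
parent2 = ("-", ("/", "x1", "x2"), 1.0)
child1, child2 = crossover(parent1, parent2, rng)
print(child1, child2)
```

Since whole sub-trees are exchanged, the children remain syntactically valid and the total number of nodes across the two programs is conserved.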
Fig. 9.2 Typical crossover operation in TGP

During the mutation process, the TGP algorithm occasionally selects a function
or terminal from a model at random and mutates it. The mutation operation can be
applied to either function or terminal nodes. A node in the tree is selected at
random. If the selected node is a terminal, it is replaced by another terminal. If
the node is a function and point mutation is to be applied, it is replaced by a new
function with the same arity (number of arguments). If a tree mutation is to be
performed, a new function node, not necessarily with the same arity, is chosen, and
the original node together with its sub-tree is replaced by a new randomly created
sub-tree. Fig. 9.3 illustrates a typical mutation operation in TGP. The best
program that appeared in any generation, the best-so-far solution, defines the
output of the GP algorithm [6].
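As an illustration of point mutation only, the sketch below replaces a single randomly chosen node while keeping the tree shape unchanged. The function and terminal sets, the tuple encoding and the names `FUNCTION_ARITY`, `TERMINALS`, `node_paths` and `point_mutate` are hypothetical.

```python
import random

# Assumed function set (symbol -> number of arguments) and terminal set
FUNCTION_ARITY = {"+": 2, "-": 2, "*": 2, "/": 2, "sin": 1, "cos": 1}
TERMINALS = ["x1", "x2", 1.0, 2.0, 3.0]


def node_paths(tree, path=()):
    """Return the path (tuple of child indices) of every node in the tree."""
    paths = [path]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            paths.extend(node_paths(child, path + (i,)))
    return paths


def point_mutate(tree, path, rng):
    """Mutate the single node at `path`; the tree shape is left unchanged."""
    if path:
        i = path[0]
        return tree[:i] + (point_mutate(tree[i], path[1:], rng),) + tree[i + 1:]
    if isinstance(tree, tuple):
        # Function node: pick a different function with the same number of arguments
        arity = FUNCTION_ARITY[tree[0]]
        choices = [f for f, a in FUNCTION_ARITY.items() if a == arity and f != tree[0]]
        return (rng.choice(choices),) + tree[1:]
    # Terminal node: pick a different terminal
    return rng.choice([t for t in TERMINALS if t != tree])


rng = random.Random(1)
tree = ("+", "x1", ("sin", ("*", "x2", 3.0)))
path = rng.choice(node_paths(tree))
mutant = point_mutate(tree, path, rng)
print(mutant)
```

Tree mutation would instead discard the selected node together with its sub-tree and grow a new random sub-tree in its place.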




Fig. 9.3 Typical mutation operation in GP 



In addition to traditional tree-based GP, there are other types of GP where
programs are represented in different ways, namely linear and graph-based GP [15].
The emphasis of the present study is placed on the linear GP techniques. Several
linear variants of GP have recently been proposed, such as linear genetic
programming (LGP) and multi-expression programming (MEP). The linear variants of GP
make a clear distinction between the genotype and phenotype of an individual. In
these variants, individuals are represented as linear strings [11, 16]. When
executed, such linear programs can have a complex control flow similar to the trees
of standard GP. There are several reasons for using linear GP. Basic computer
architectures are fundamentally the same now as they were twenty years ago, when GP
began. Almost all architectures represent computer programs in a linear fashion. In
other words, computers do not naturally run tree-shaped programs. Hence, slow
interpreters have to be used as part of tree-based GP. Conversely, by evolving the
binary bit patterns directly, the use of an expensive interpreter (or compiler) is
avoided. Consequently, the linear GP methods can run several orders of magnitude
faster than comparable interpreting systems [11, 17]. The enhanced speed of the
linear variants of GP (e.g., LGP) permits conducting many runs in realistic
timeframes. This leads to deriving consistent, high-precision models with little
customization [18].

9.2.1.1 Linear Genetic Programming 

LGP is a subset of GP with a linear representation of individuals. The main
characteristic of LGP in comparison with traditional tree-based GP is that
expressions of a functional programming language (like LISP) are substituted by
programs of an imperative language (like C/C++) [8, 9]. Fig. 9.4 presents a
comparison of the program structures in LGP and tree-based GP. As shown in
Fig. 9.4(a), a linear genetic program can be seen as a data flow graph generated by
multiple usage of register content. That is, on the functional level the evolved
imperative structure denotes a special directed graph. As can be observed from
Fig. 9.4(b), in tree-based GP the data flow is more rigidly determined by the tree
structure of the program [9, 11].

Fig. 9.4 Comparison of the GP program structures: (a) LGP, (b) tree-based GP (after [19])



In the LGP system described here, an individual program is interpreted as a
variable-length sequence of simple C instructions. The instruction set or function
set of LGP consists of arithmetic operations, conditional branches, and function
calls. The terminal set of the system is composed of variables and constants. The
instructions are restricted to operations that accept a minimum number of constants
or memory variables, called registers (r), and assign the result to a destination
register, e.g., r0 := r1 + 1. A part of a linear genetic program in C code is
presented in Fig. 9.5. In this figure, register r[0] holds the final program output.

    #include <math.h>

    /* Six registers, r[0]..r[5]; r[0] holds the final program output. */
    void LGP(double r[6])
    {
        r[0] = r[5] + 70;
        r[5] = r[0] - 50;
        if (r[1] > 0)
            if (r[5] > 2)
                r[4] = r[2] * r[1];   /* executed only if both conditions hold */
        r[2] = r[5] + r[4];
        r[0] = sin(r[2]);
    }



Fig. 9.5 An excerpt of a linear genetic program 
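The way such a register program executes can be illustrated with a small interpreter. The sketch below is written in Python for brevity; the instruction encoding, the names `OPS`, `value` and `run`, and the rule that a failed branch skips any chained branches plus the single instruction they guard are assumptions chosen to mirror the nested `if` statements of the excerpt in Fig. 9.5, not the internals of any particular LGP tool.

```python
import math

# Illustrative instruction encoding (an assumption):
#   ("set", dest, op, a, b)  ->  r[dest] = a op b
#   ("sin", dest, a)         ->  r[dest] = sin(a)
#   ("if>", a, b)            ->  conditional branch on a > b
# Operands are ("r", i) for register i or ("c", v) for the constant v.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def value(operand, r):
    kind, x = operand
    return r[x] if kind == "r" else x


def run(program, registers):
    """Execute a linear program and return the final register contents."""
    r = list(registers)
    pc = 0
    while pc < len(program):
        ins = program[pc]
        if ins[0] == "if>":
            if value(ins[1], r) > value(ins[2], r):
                pc += 1                      # condition holds: fall through
            else:
                while pc < len(program) and program[pc][0] == "if>":
                    pc += 1                  # skip chained branches
                pc += 1                      # skip the guarded instruction
            continue
        if ins[0] == "set":
            _, dest, op, a, b = ins
            r[dest] = OPS[op](value(a, r), value(b, r))
        elif ins[0] == "sin":
            _, dest, a = ins
            r[dest] = math.sin(value(a, r))
        pc += 1
    return r


program = [
    ("set", 0, "+", ("r", 5), ("c", 70)),
    ("set", 5, "-", ("r", 0), ("c", 50)),
    ("if>", ("r", 1), ("c", 0)),
    ("if>", ("r", 5), ("c", 2)),
    ("set", 4, "*", ("r", 2), ("r", 1)),
    ("set", 2, "+", ("r", 5), ("r", 4)),
    ("sin", 0, ("r", 2)),
]
out = run(program, [0.0, 1.0, 2.0, 0.0, 0.0, 1.0])
print(out[0])   # register 0 holds the program output
```

Because each instruction is a flat tuple, genetic operators can insert, delete or exchange instructions without having to repair any nested structure.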

The LGP system follows these steps for a single run [8, 11, 20]:

1. Initializing a population of randomly generated programs and calculating their
fitness values.

2. Running a tournament. In this step, four programs are selected from the
population at random. They are compared and, based on their fitness, two programs
are picked as the winners and two as the losers.

3. Transforming the winner programs. The two winner programs are copied and
transformed probabilistically into two new programs via crossover and mutation
operators.

4. Replacing the loser programs in the tournament with the transformed winner
programs. The winners of the tournament remain unchanged.

5. Repeating steps two through four until termination or convergence conditions
are satisfied.
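The five steps above amount to a steady-state tournament loop, which can be sketched as follows. To keep the example self-contained, a "program" here is simply a list of numbers with a sum-of-squares fitness; these toy stand-ins, and the names `tournament_step`, `evolve`, `toy_fitness` and `toy_vary`, are assumptions and do not represent real LGP programs.

```python
import random


def tournament_step(population, fitness, vary, rng):
    """Steps 2 to 4: pick 4 programs, keep 2 winners, overwrite 2 losers."""
    contestants = rng.sample(range(len(population)), 4)
    contestants.sort(key=lambda i: fitness(population[i]))   # lower is better
    winners, losers = contestants[:2], contestants[2:]
    offspring = vary(population[winners[0]], population[winners[1]], rng)
    for slot, child in zip(losers, offspring):
        population[slot] = child             # the winners stay unchanged


def evolve(population, fitness, vary, rng, max_tournaments, target=0.0):
    """Step 5: repeat tournaments until the limit or the target is reached."""
    for _ in range(max_tournaments):
        tournament_step(population, fitness, vary, rng)
        if min(fitness(p) for p in population) <= target:
            break
    return min(population, key=fitness)


# Toy stand-ins (assumptions made only for this example)
def toy_fitness(individual):
    return sum(x * x for x in individual)


def toy_vary(a, b, rng):
    cut = rng.randrange(1, len(a))
    children = [a[:cut] + b[cut:], b[:cut] + a[cut:]]          # crossover
    return [[x + rng.gauss(0, 0.1) for x in c] for c in children]  # mutation


rng = random.Random(42)
population = [[rng.uniform(-5, 5) for _ in range(4)] for _ in range(20)]  # step 1
initial_best = min(toy_fitness(p) for p in population)
best = evolve(population, toy_fitness, toy_vary, rng, max_tournaments=500)
print(initial_best, toy_fitness(best))
```

Since the best program in a tournament is always a winner and winners are never overwritten, the best fitness in the population cannot get worse from one tournament to the next.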

Crossover occurs between instruction blocks. Fig. 9.6 demonstrates a two-point
linear crossover used in LGP for recombining two tournament winners. As can be
seen, a segment of random position and arbitrary length is selected in each of the
two parents and exchanged. If one of the two children would exceed the maximum
length, crossover is aborted and restarted with exchanging equally sized segments.
The mutation operation occurs on a single instruction. Two types of standard LGP
mutations are commonly used: micro and macro mutation. Micro mutation changes an
operand or an operator of an instruction. Macro mutation inserts or deletes a
random instruction [9]. Comprehensive descriptions of the basic parameters used to
direct a search for a linear genetic program can be found in [8, 20].

Fig. 9.6 Crossover in LGP [9]
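A minimal sketch of these variation operators is given below, assuming each instruction is a tuple `(destination, operator, source1, source2)` of register indices and an operator symbol. The operator set, the number of registers and all function names are illustrative assumptions.

```python
import random

OPERATORS = ["+", "-", "*", "/"]    # assumed operator set
N_REGISTERS = 4                     # assumed number of registers


def random_instruction(rng):
    return (rng.randrange(N_REGISTERS), rng.choice(OPERATORS),
            rng.randrange(N_REGISTERS), rng.randrange(N_REGISTERS))


def _segment(prog, rng, length=None):
    """Pick a segment [start, end) of random position and (optionally) length."""
    if length is None:
        length = rng.randint(1, len(prog))
    start = rng.randint(0, len(prog) - length)
    return start, start + length


def linear_crossover(p1, p2, max_len, rng):
    """Two-point linear crossover; parents are assumed not to exceed max_len."""
    a0, a1 = _segment(p1, rng)
    b0, b1 = _segment(p2, rng)
    c1 = p1[:a0] + p2[b0:b1] + p1[a1:]
    c2 = p2[:b0] + p1[a0:a1] + p2[b1:]
    if len(c1) > max_len or len(c2) > max_len:
        # Abort and restart with equally sized segments (lengths are preserved)
        n = rng.randint(1, min(len(p1), len(p2)))
        a0, a1 = _segment(p1, rng, n)
        b0, b1 = _segment(p2, rng, n)
        c1 = p1[:a0] + p2[b0:b1] + p1[a1:]
        c2 = p2[:b0] + p1[a0:a1] + p2[b1:]
    return c1, c2


def micro_mutation(prog, rng):
    """Change one operand or the operator of a single instruction."""
    i = rng.randrange(len(prog))
    ins = list(prog[i])
    field = rng.randrange(4)
    ins[field] = rng.choice(OPERATORS) if field == 1 else rng.randrange(N_REGISTERS)
    return prog[:i] + [tuple(ins)] + prog[i + 1:]


def macro_mutation(prog, max_len, rng):
    """Delete or insert one random instruction (assumes max_len > 1)."""
    if len(prog) > 1 and (len(prog) >= max_len or rng.random() < 0.5):
        i = rng.randrange(len(prog))
        return prog[:i] + prog[i + 1:]
    i = rng.randint(0, len(prog))
    return prog[:i] + [random_instruction(rng)] + prog[i:]


rng = random.Random(3)
p1 = [random_instruction(rng) for _ in range(6)]
p2 = [random_instruction(rng) for _ in range(9)]
c1, c2 = linear_crossover(p1, p2, max_len=10, rng=rng)
print(len(c1), len(c2))
```

With equally sized segments each child has the same length as the parent it was copied from, so the fallback can never violate the length limit.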

9.2.2 Artificial Neural Network 

ANNs emerged from attempts to simulate the biological nervous system. The ANN
method was founded in the early 1940s by McCulloch and co-workers [21]. Early
research focused on building simple neural networks to model simple logic
functions. At present, ANNs can be applied to problems that do not have algorithmic
solutions or problems with complex solutions. An ANN formulates a mathematical
model for a system in which no clear relationship is available between inputs and
outputs. ANNs use the data alone to determine the structure of the model and the
unknown model parameters. The ability of ANNs to learn by example makes them very
flexible and powerful. Thus, this approach has been widely applied to regression
and classification problems in many fields. In this study, the approximation
ability of two of the most well-known ANN architectures, MLP and RBF, is
investigated.

9.2.2.1 Multilayer Perceptron Network 

MLPs are a class of ANN structures using a feed-forward architecture. MLP
networks are usually applied to supervised learning tasks, which involve iterative
training methods to adjust the connection weights within the network. MLPs are
universal approximators, that is, they are capable of approximating essentially any
continuous function to an arbitrary degree of accuracy. They are often trained with
the back propagation (BP) algorithm [22]. Fig. 9.7 shows a schematic representation
of an MLP network. An MLP consists of an input layer, at least one hidden layer of
neurons and an output layer. Each of these layers has several processing units
(nodes), and each unit is fully interconnected through weighted connections to the
units in the subsequent layer. Every input is multiplied by the interconnection
weights of the nodes. The output ($h_j$) is obtained by passing the sum of the
products through an activation function as follows:

$$h_j = f\left(\sum_i x_i w_{ij} + b\right) \qquad (9.1)$$

where $f(\cdot)$ is the activation function; $x_i$ is the activation of the $i$th
hidden layer node; $w_{ij}$ is the weight of the connection joining the $j$th neuron
in a layer with the $i$th neuron in the previous layer; and $b$ is the bias for the
neuron. For nonlinear problems, sigmoid functions (hyperbolic tangent sigmoid or
log-sigmoid) are usually adopted as the activation function. Adjusting the
interconnections between layers will reduce the following error function:

where $t_k^n$ and $h_k^n$ are respectively the calculated output and the actual
output value, $n$ is the number of samples and $k$ is the number of output nodes.
Further details of MLPs can be found in [23, 24].
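The neuron computation of Eq. (9.1) can be illustrated with a small forward pass. This is a sketch under stated assumptions: a hyperbolic tangent hidden layer and a linear output layer, hand-picked weights rather than trained ones, one bias per neuron, and hypothetical function names `layer_output` and `mlp_forward`.

```python
import math


def layer_output(inputs, weights, biases, activation):
    """Compute f(sum_i x_i * w_ij + b_j) for every neuron j of one layer.

    weights[j] holds the weights of the connections entering neuron j.
    """
    return [
        activation(sum(x * w for x, w in zip(inputs, w_j)) + b_j)
        for w_j, b_j in zip(weights, biases)
    ]


def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    """One hidden layer (tanh) followed by a linear output layer."""
    hidden = layer_output(x, hidden_w, hidden_b, math.tanh)
    return layer_output(hidden, out_w, out_b, lambda s: s)


# Illustrative weights: 2 inputs, 2 hidden neurons, 1 output
x = [1.0, 2.0]
hidden_w = [[0.5, -0.25], [0.0, 0.0]]
hidden_b = [0.0, 1.0]
out_w = [[2.0, 1.0]]
out_b = [0.3]

y = mlp_forward(x, hidden_w, hidden_b, out_w, out_b)
print(y)
```

Training then consists of adjusting `hidden_w`, `hidden_b`, `out_w` and `out_b` so that the discrepancy between such outputs and the measured values is reduced.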
Fig. 9.7 A schematic representation of an MLP network

9.2.2.2 Radial Basis Function 

RBF networks have feed-forward architectures. Compared to other ANN structures
such as MLPs, RBF networks generally find complex relationships faster and their
training is much less computationally intensive. A schematic representation of the
RBF network is illustrated in Fig. 9.8. The structure of the RBF network consists
of an input layer, a hidden layer with a nonlinear RBF activation function, and a
linear output layer. Input vectors are transformed into radial basis functions by
means of the hidden layer.

Fig. 9.8 A schematic representation of an RBF network



The transformation functions used are based on a Gaussian distribution as an
activation function. The center and width are two important parameters of the
Gaussian basis function. As the distance, usually the Euclidean distance, between
the input vector and its center increases, the output given by the activation
function decays to zero. The rate of decrease in the output is controlled by the
width of the RBF. The Gaussian basis function is given in the following form:



C/x) = exp 
(9.3)

where $\|\cdot\|$ is the Euclidean norm, $x$ is the input pattern, and $\mu_j$ and
$\sigma_j$ are respectively the center and the spread of the Gaussian basis
function. The output of the $k$th neuron in the output layer of the network is
computed as:

(9.4)

in which $n$ is the number of hidden neurons, $w_{jk}$ is the weight between the
$j$th hidden neuron and the $k$th output neuron, and $b_k$ is the bias term. RBF
networks with Gaussian basis functions have been shown to be universal function
approximators with high pointwise convergence [25].
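The behavior described above can be sketched as follows. The basis function is written here in one common Gaussian form, exp(-||x - mu||^2 / (2 sigma^2)); that normalization, the hand-picked centers, spreads and weights, and the names `gaussian_basis` and `rbf_output` are assumptions made for illustration.

```python
import math


def gaussian_basis(x, center, spread):
    """Gaussian radial basis: 1 at the center, decaying with Euclidean distance."""
    dist2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist2 / (2.0 * spread ** 2))


def rbf_output(x, centers, spreads, weights, biases):
    """Linear output layer: weighted sum of the hidden activations plus a bias.

    weights[k] holds the weights from all hidden neurons to output neuron k.
    """
    g = [gaussian_basis(x, c, s) for c, s in zip(centers, spreads)]
    return [
        sum(w * g_j for w, g_j in zip(w_k, g)) + b_k
        for w_k, b_k in zip(weights, biases)
    ]


# Illustrative network: 2 inputs, 2 hidden neurons, 1 output
centers = [[0.0, 0.0], [1.0, 1.0]]
spreads = [1.0, 1.0]
weights = [[2.0, 0.0]]
biases = [0.5]

y = rbf_output([0.0, 0.0], centers, spreads, weights, biases)
print(y)
```

A larger spread makes each hidden neuron respond to a wider region of the input space, which is why the spread is the main quantity tuned in the RBF analyses.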

9.3 Modeling Process and Parameters Setting 
9.3.1 Model Development Using GP-Based Methods 

Various parameters are involved in the TGP and LGP predictive algorithms. Several
runs were conducted to arrive at a parameterization of TGP and LGP that provided
enough robustness and generalization to solve the problems. In this study, basic
arithmetic operators and mathematical functions were utilized to obtain the optimum
GP models. The number of programs in the population that TGP and LGP evolve is set
by the population size. A run takes longer with a larger population size. The
maximum number of tournaments sets the outer limit on the tournaments that occur
before the program terminates the run. The proper population size and number of
tournaments depend on the number of possible solutions and the complexity of the
problem. Different levels were tested for the population size and the number of
tournaments to find models with minimum error. The program was run until there was
no longer significant improvement in the performance of the models or the runs
terminated automatically. The mutation rate was set to 90%. Two levels were
considered for the crossover rate: 50% at the low level and 95% at the high level.
The values of the other parameters were selected based on previously suggested
values [6, 11, 20, 26] and a trial-and-error approach. Different parameter
combinations were tested, and 10 replications were carried out for each. For each
of the problems, the overall number of runs was equal to 60 for each of the TGP and
LGP algorithms. For the TGP analysis, a MATLAB toolbox, namely GPLAB [27], was
used. The LGP algorithm was implemented using the Discipulus software [28].

Overfitting is one of the principal problems in machine learning generalization.
An efficient approach to prevent overfitting is to test other individuals from the
run on a validation set to find a better generalization [15]. This technique was
used in this study to improve the generalization of the models. For this purpose,
the available data sets were randomly divided into learning, validation and testing
subsets. The learning data were used for training (genetic evolution). The
validation data were used to assess the generalization capability of the evolved
programs on data they were not trained on (model selection). In other words, both
the learning and validation data sets were used to select the best evolved programs
and were therefore part of the training process. Thus, they were grouped into one
category referred to as "training data". The testing data were finally used to
measure the performance of the models obtained by TGP and LGP on data that played
no role in building the models. A trial study was conducted to find a consistent
data division. The selection was such that the statistical properties (e.g. mean
and standard deviation) of the training and testing subsets were similar.
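One simple way to search for such a consistent data division is to try several random splits and keep the one whose training and testing targets have the closest mean and standard deviation. The sketch below illustrates this idea on synthetic data; the mismatch measure, the number of trials and the function names are assumptions, and the procedure actually followed in the trial study may differ.

```python
import random
import statistics


def split_indices(n, n_learn, n_val, rng):
    """Randomly divide indices 0..n-1 into learning, validation and testing."""
    idx = list(range(n))
    rng.shuffle(idx)
    return idx[:n_learn], idx[n_learn:n_learn + n_val], idx[n_learn + n_val:]


def mismatch(values, train_idx, test_idx):
    """Gap in mean and standard deviation between training and testing targets."""
    train = [values[i] for i in train_idx]
    test = [values[i] for i in test_idx]
    return (abs(statistics.mean(train) - statistics.mean(test))
            + abs(statistics.stdev(train) - statistics.stdev(test)))


def consistent_split(values, n_learn, n_val, trials, rng):
    """Keep the random division with the smallest train/test mismatch."""
    best = None
    for _ in range(trials):
        learn, val, test = split_indices(len(values), n_learn, n_val, rng)
        score = mismatch(values, learn + val, test)   # training = learning + validation
        if best is None or score < best[0]:
            best = (score, learn, val, test)
    return best


rng = random.Random(7)
y = [rng.gauss(35.0, 15.0) for _ in range(200)]       # synthetic target values
score, learn, val, test = consistent_split(y, n_learn=140, n_val=20, trials=50, rng=rng)
print(score, len(learn), len(val), len(test))
```

The same check can be extended to the input variables, so that every predictor has a similar range in the training and testing subsets.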

9.3.2 Model Development Using ANN-Based Methods 

For the development of the MLP and RBF models, two scripts were written in the
MATLAB environment using Neural Network Toolbox 5.1 [29]. The performance of an ANN
model mainly depends on the network architecture and parameter settings. For the
traditional MLP, a single hidden layer network is sufficient to uniformly
approximate any continuous nonlinear function, according to the universal
approximation theorem [23]. The choice of the number of hidden layer nodes,
learning rate, number of epochs and type of activation function plays an important
role in the construction of the MLP and RBF models. Hence, several network models
with different settings for these parameters were trained to reach the optimal
configurations with the desired precision [30]. The written program automatically
tries various numbers of neurons in the hidden layer and reports the R and MAE
values. The data division for the MLP and RBF analyses was similar to that
considered for TGP and LGP.



9.3.3 Finding the Optimum Models 

The best models were chosen on the basis of a multi-objective strategy as follows: 

• The simplicity of the models, although this was not a predominant factor. 

• Providing the best fitness value on the learning set of data. 

• Providing the best fitness value on a validation set of data. 

The first objective can be controlled by the user through the parameter settings
(e.g., program size for GP and number of hidden layer neurons for ANN). For the
other objectives, the following objective function (OBJ) was constructed as a
measure of how well the model-predicted output agrees with the experimentally
measured output. The best models were selected by minimizing the following
function:

where $No._{Learning}$, $No._{Validation}$ and $No._{Training}$ are respectively the
numbers of learning, validation and training data; R and MAE are respectively the
correlation coefficient and the mean absolute error, given as follows:

in which $h_i$ and $t_i$ are respectively the actual and calculated outputs for the
$i$th output, $\bar{h}$ is the average of the actual outputs, and $n$ is the number
of samples. It is well known that the R value alone is not a good indicator of the
prediction accuracy of a model. This is because shifting the output values of a
model equally does not change the R value. The constructed objective function takes
into account the changes of R and MAE together. Higher R values and lower MAE
values result in a lower OBJ and, consequently, indicate a more precise model. In
addition, the above function considers the effects of different data divisions for
the learning and validation data.
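The remark that R alone can be misleading is easy to verify numerically. The sketch below uses the usual Pearson form of the correlation coefficient and the usual mean absolute error; these standard definitions and the small made-up data set are assumptions made for illustration.

```python
import math


def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient between actual and predicted outputs."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [20.0, 30.0, 40.0, 50.0]          # made-up measured values
predicted = [22.0, 29.0, 41.0, 48.0]       # made-up model outputs
shifted = [p + 10.0 for p in predicted]    # same model, biased by a constant

r_model = correlation_coefficient(actual, predicted)
r_shifted = correlation_coefficient(actual, shifted)
mae_model = mean_absolute_error(actual, predicted)
mae_shifted = mean_absolute_error(actual, shifted)
print(r_model, r_shifted, mae_model, mae_shifted)
```

The constant shift leaves R unchanged while the MAE grows from 1.5 to 10.0, which is why an objective that combines both measures discriminates better between models.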



9.4 Case Studies 



9.4.1 Compressive Strength of Concrete 

The performance characteristics of concrete are major concerns in civil
engineering construction. Enhanced performance characteristics are generally
achieved by adding various cementitious materials and chemical and mineral
admixtures to conventional concrete mix designs. According to the well-known
Abrams' rule, the strength of concrete is negatively correlated with the water to
cement ratio. This rule implies that only the quality of the cement paste controls
the strength of comparable concretes. Based on a variety of experimental studies,
this is not quite true. For example, if two comparable concrete mixtures have the
same water to cement ratio, the strength of the concrete with the higher cement
content is lower [31]. Several studies have independently shown that concrete
strength development is determined not only by the water to cement ratio, but also
by the content of other ingredients [32]. Advances in recent years have been
assisted by the use and understanding of chemical admixtures, notably
superplasticizers, and cement replacement materials, notably fly ash and blast
furnace slag. The use of fly ash and slag plays an important role in contributing
to better workability and low slump loss rates of concrete. This is due to the
mutual containment with surface lubrication and the ball-bearing effects among the
fly ash and micro-fine materials. In many cases, there is also the economic benefit
of the price differential between cement and the supplementary cementitious
material. Additionally, partial replacement of cement nearly always allows a
significant reduction in the dosage of the superplasticizer, which is a
particularly expensive ingredient [33].

9.4.1.1 Modeling 

In its current state, behavior modeling of the compressive strength of concrete
containing additives is inherently more difficult than for conventional concrete.
To provide an accurate assessment of the performance characteristics of a concrete
mix, the effects of several parameters should be incorporated into the model
development. Therefore, in this study, the GP and ANN-based approaches were
utilized to obtain meaningful relationships between the compressive strength
($f_c$) of concrete mixes and the predictor variables as follows:

$$f_c = f\left(\frac{W}{C},\, B,\, F,\, S,\, \frac{CA}{FA},\, \ln(A)\right) \qquad (9.8)$$

where,

W/C: Water to cement ratio

%B: Blast furnace slag content

%F: Fly ash content

%S: Superplasticizer content

CA/FA: Coarse aggregate to fine aggregate ratio

Ln(A): Natural logarithm of age

The above variables were chosen as the input variables on the basis of a literature 
review [33-35]. 

It is known that models derived using GP, ANNs or other CI approaches have, in
most cases, a predictive capability limited to the data range used for their
development. Thus, the amount of data used for training these algorithms is an
important issue, as it bears heavily on the reliability of the final models. The
only way to overcome this limitation is to employ comprehensive data sets for
training. Hence, a reliable database consisting of tests on mixtures with a wide
range of aggregate gradation and properties was obtained from the literature to
develop the generalized models. The database contains 1133 concrete compressive
strength test results presented by Yeh [34, 35]. It includes measurements of water
(W), cement (C), blast furnace slag (B), fly ash (F), superplasticizer (S), coarse
aggregate (CA), fine aggregate (FA), age of specimens (A) and $f_c$ of concrete
mixes. To visualize the distribution of the samples, the data are presented as
frequency histograms (Fig. 9.9).

Fig. 9.9 Histograms of the variables used in the model development

Some of the high-performance concrete (HPC) property variables may be
fundamentally interdependent. The first step in the analysis of interdependency of
the data is to make a careful study of what these variables are measuring, noting
any highly correlated pairs. High positive or negative correlation coefficients
between pairs may lead to poor performance of the models and difficulty in
interpreting the effects of the explanatory variables on the response. This
interdependency can cause problems in the analysis, as it tends to exaggerate the
strength of relationships between variables. This is a simple case of what is
commonly known as the problem of multicollinearity [36]. Thus, the correlation
coefficients between all possible pairs were determined and are shown in Table 9.1.
As can be seen, there are no high correlations between the predictor variables. For
the analysis, 907 values (80%) of the data were taken for the training process (807
sets for learning and 100 sets for validation). The rest of the data were used for
testing the generalization capability of the models.

Table 9.1 Correlation coefficients between all pairs of the explanatory variables

Variable         W/C      B(%)     F(%)     S(%)     CA/FA    Ln(Age) (day)
W/C              1.000    0.341    0.337   -0.141   -0.091    0.044
B(%)             0.341    1.000   -0.275    0.045    0.065   -0.026
F(%)             0.337   -0.275    1.000    0.393   -0.078    0.000
S(%)            -0.141    0.045    0.393    1.000   -0.255   -0.060
CA/FA           -0.091    0.065   -0.078   -0.255    1.000    0.056
Ln(Age) (day)    0.044   -0.026    0.000   -0.060    0.056    1.000
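The screening for highly correlated pairs can be expressed as a short check on the values of Table 9.1. In the sketch below the matrix is copied from that table, while the threshold of 0.8 for a "high" correlation and the function names are assumptions, since no explicit cut-off is stated in the text.

```python
names = ["W/C", "B", "F", "S", "CA/FA", "Ln(A)"]

# Correlation coefficients from Table 9.1
corr = [
    [1.000, 0.341, 0.337, -0.141, -0.091, 0.044],
    [0.341, 1.000, -0.275, 0.045, 0.065, -0.026],
    [0.337, -0.275, 1.000, 0.393, -0.078, 0.000],
    [-0.141, 0.045, 0.393, 1.000, -0.255, -0.060],
    [-0.091, 0.065, -0.078, -0.255, 1.000, 0.056],
    [0.044, -0.026, 0.000, -0.060, 0.056, 1.000],
]


def pairwise(names, corr):
    """List (|r|, name_i, name_j) for every pair of distinct variables."""
    n = len(names)
    return [(abs(corr[i][j]), names[i], names[j])
            for i in range(n) for j in range(i + 1, n)]


def flagged_pairs(names, corr, threshold):
    """Pairs whose absolute correlation reaches the threshold."""
    return [(a, b) for r, a, b in pairwise(names, corr) if r >= threshold]


strongest = max(pairwise(names, corr))
flagged = flagged_pairs(names, corr, threshold=0.8)   # assumed cut-off
print(strongest, flagged)
```

The strongest pair is fly ash and superplasticizer with a coefficient of 0.393, and no pair is flagged, which is consistent with the statement that the predictors are not highly correlated.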

The maximum tree depth and program size for the optimal TGP and LGP models were
equal to 30 and 256, respectively. Various training algorithms were implemented for
training the MLP networks. The best results were obtained with the quasi-Newton
back-propagation method. The hyperbolic tangent sigmoid was adopted as the transfer
function between the input and hidden layers, and a linear transfer function was
used between the hidden layer and the output layer. The best MLP model was built
with one hidden layer of 18 neurons and a learning rate of 0.05, and was trained
for 1000 epochs. For the RBF analysis, different spreads were checked and the
optimum one was equal to 4.4.

9.4.1.2 Comparison of the Results 

The compressive strength prediction equations for the best results of the TGP and 
LGP algorithms are given as follows: 

/,,^^,(M/^«) = ;| (F-F4f |^-FLn(A)+fF-FB^YLn(A)(Ln(A)-F5)-Fl3Bf^-FF-FLn(A)| | 1 1 + 1 (9.9) 



/„,„/MPa) = ;|Ln(A|36F+36B-^^^l-||j + g + 5J-^-Ln(A)-S + 7 (9.10) 

Comparisons of the experimental versus predicted compressive strength values
using TGP, LGP, MLP and RBF are illustrated in Fig. 9.10. The other performance
statistics of these models are presented in Table 9.2. It is notable that no
rational model to predict the compressive strength of HPC mixes has yet been
developed that encompasses the influencing variables considered in this study.
Therefore, it was not possible to conduct a more comprehensive comparative study
herein. The results indicate that the TGP, LGP, MLP and RBF models are able to
predict the compressive strength with a high degree of accuracy. Comparing the
performance of the GP-based methods, it can be observed from Fig. 9.10 and
Table 9.2 that LGP has produced better outcomes than TGP. The ANN-based techniques
have provided better results than TGP and LGP. The best results for the training
and testing data are obtained by RBF and MLP, respectively.

Fig. 9.10 Histograms of the CI models (a) TGP, (b) LGP, (c) MLP, and (d) RBF
Table 9.2 Statistical performances of the CI models

                  Training data                      Testing data
Model             R       MAE     Ave.    Std.       R       MAE     Ave.    Std.
GP techniques
  TGP             0.873   5.97    0.980   0.223      0.881   6.14    0.979   0.233
  LGP             0.894   5.70    0.969   0.218      0.906   5.71    0.983   0.232
ANN techniques
  MLP             0.943   3.99    1.018   0.260      0.935   4.31    0.992   0.278
  RBF             0.956   3.44    1.005   0.207      0.930   4.37    1.043   0.232



9.4.2 Elastic Modulus of Concrete 

The elastic modulus of concrete is a key factor in structural and material
engineering. Designers need the elastic modulus for estimating immediate and
time-dependent deformation, determining the modular ratio, and evaluating the
stiffness of buildings and members. The modulus of elasticity is also important in
reinforced and pre-stressed concrete for creep and shrinkage evaluation, as well as
in crack control, especially at an early age [26, 37]. The modulus of elasticity
can be derived from the stress-strain response of concrete under compression. It is
defined, in the region in which Hooke's law is obeyed for the material, as the
ratio of stress to strain [38]. In mechanics, Hooke's law of elasticity is an
approximation stating that the amount of strain is linearly related to the stress.
The modulus can be determined from the slope of compressive stress-strain curves.
As shown in Fig. 9.11, in a typical stress-strain diagram of concrete, the first
part of the curve is nearly a straight line with some curvature at $\sigma$, which
is equal to half of the maximum value, $\sigma_u$. The initial slope of the
stress-strain curve defines the initial or tangent modulus used with the parabolic
stress method. The slope of the chord connecting the origin of the coordinate
system to $0.5\sigma_u$ determines the secant modulus of elasticity, which is
generally used in straight-line stress calculation [26, 39].



Fig. 9.11 Typical stress-strain diagram of concrete

Despite its importance, tensile strength (and elastic modulus) is not usually
measured on site for compliance purposes. It is often estimated from the measured
compressive strength based on the empirical relationships proposed by various codes
of practice. This is mainly to avoid performing laborious and time-consuming direct
measurements from the load-deformation curve [40].



9.4.2.1 Modeling 

The GP and ANN-based approaches were employed to formulate the elastic
modulus ($E_c$) of NSC and HSC in terms of compressive strength ($f_c$) as follows:

$E_c = f(f_c)$  (9.11)



An experimental database of previously published test results [41] was utilized to 
develop the models. The database has previously been employed by Gandomi et 
al. [26] and Demir [42] to develop the LGP and ANN models, respectively. The 
database contains 89 and 70 test results respectively for the elastic modulus of 
normal-strength concrete (NSC) and high-strength concrete (HSC). Of the total 
159 data sets for HSC and NSC, 126 values were taken for the training of the al- 
gorithm (112 sets for learning and 15 sets for validation). The remaining 33 values 
were used for the testing of the derived models. To visualize the distribution of the 
samples, the data are presented by frequency histograms (Fig. 9.12). 

For the TGP analysis, the maximum tree depth was set to 10. The parameter
settings of the LGP algorithm can be found in [26]. The characteristics of the best
MLP structure are given in [42]. For the RBF analysis, different spreads were
checked and the optimum one was equal to 35.

Fig. 9.12 Histograms of: (a) compressive strength and (b) elastic modulus for all data

9.4.2.2 Comparison of the Results 

The TGP and LGP-based formulations of the $E_c$ of concrete in terms of $f_c$ are
given below:

$E_{c,TGP}\,(\mathrm{GPa}) = 4\sqrt{f_c} + 9$  (9.12)
(9.13) 



The elastic modulus predictions obtained by TGP, LGP, MLP and RBF are shown
in Fig. 9.13. The statistical performance of the different models in terms of their
prediction capabilities is summarized in Table 9.3. As can be seen, TGP, LGP, MLP
and RBF give precise estimates of the target values, and their performance is
fairly similar. Overall, RBF has provided better results than the other methods.

Fig. 9.13 Histograms of the CI models (a) TGP, (b) LGP, (c) MLP, and (d) RBF

Table 9.3 Statistical performances of the CI models



Model            Training data                      Testing data
                 R       MAE     Ave.    Std.       R       MAE     Ave.    Std.
GP techniques
  TGP            0.946   2.382   0.997   0.096      0.956   2.254   0.954   0.174
  LGP            0.946   2.354   1.001   0.096      0.957   2.181   0.957   0.173
ANN techniques
  MLP            0.952   2.215   1.000   0.094      0.966   2.110   0.951   0.170
  RBF            0.961   2.031   0.997   0.083      0.962   2.276   0.945   0.171

9.5 Discussion 

Based on a logical hypothesis [43], if a model gives R > 0.8, and the error values 
(e.g., MAE) are at the minimum, there is a strong correlation between the pre- 
dicted and measured values. The model can therefore be judged as very good. It 
can be observed from Figs. 9.10 and 9.13 and Tables 9.2 and 9.3 that all the GP 
and ANN-based models with very high R and low MAE values can accurately 
predict the target values. Meanwhile, it is noteworthy that the MAE values are not 
only low but also as similar as possible for the training and testing sets. This sug- 
gests that the proposed models have both predictive ability (low values) and gen- 
eralization performance (similar values). 
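
The two criteria used above can be illustrated with a short Python sketch. It assumes that R denotes the Pearson correlation coefficient between measured and predicted values and that MAE is the mean absolute error; the function names and the sample values are hypothetical and are not taken from the database used in this chapter.

```python
import math


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


def mean_absolute_error(measured, predicted):
    """Average of the absolute differences between measured and predicted values."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


# Hypothetical measured and predicted values, for illustration only.
measured = [20.0, 25.0, 30.0, 35.0, 40.0]
predicted = [21.0, 24.0, 31.0, 34.0, 41.0]

r = pearson_r(measured, predicted)
mae = mean_absolute_error(measured, predicted)
strong_correlation = r > 0.8  # threshold quoted in the text from [43]
print(f"R = {r:.3f}, MAE = {mae:.2f}, strong correlation: {strong_correlation}")
```

Evaluating both functions separately on the training and testing sets, and comparing the two MAE values, gives the generalization check described above.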

The task faced by the GP-based approaches is mainly the same as that faced by 
the ANN-based methods. GP and ANNs are machine learning techniques that can 
be applied effectively to classification and approximation problems. They learn
directly from the raw experimental (or field) data presented to them in order to
extract the subtle functional relationships among the data, even if the underlying
relationships are unknown or their physical meaning is difficult to explain.
In contrast, most conventional empirical and statistical methods, as well as
numerical approaches such as the finite element method, need prior knowledge about
the nature of the relationships among the data [11]. Classical constitutive models
rely on assuming the structure of the model in advance, which may be suboptimal.
Therefore, the GP
and ANN-based approaches are well-suited to modeling the complex behavior of 
most geotechnical engineering problems with extreme variability in their nature 
[44]. In spite of similarities, there are some important differences between GP and 
ANNs. ANNs suffer from some shortcomings including lack of transparency and 
knowledge extraction. That is, they do not explicitly explain the underlying physi- 
cal processes. The knowledge extracted by ANNs is stored in a set of weights that 
cannot properly be interpreted. Due to the high complexity of the network
structure, ANNs do not give a transparent function relating the inputs to the
corresponding outputs. The main advantage of GP over ANNs is that GP generates a
transparent and structured representation of the system being studied. An
additional advantage of GP is that it avoids the difficult task of determining the
ANN architecture. The structure and network parameters of ANNs (e.g. number of
inputs, transfer functions, number of hidden layers and their number of nodes, etc.)
should be identified a priori, which is usually done through a time-consuming
trial-and-error procedure [11]. In GP, the number and combination of terms are
automatically evolved during model calibration [13, 44]. A notable limitation of GP
and its variants is that these methods are parameter sensitive. The performance of
the TGP and LGP algorithms employed herein can be improved by using some form of
optimal control of the run parameters (e.g., GAs).

One of the goals of introducing expert systems, such as the GP and ANN-based
approaches, into the design process is better handling of the information in the
pre-design phase. In the initial steps of design, information about the features
and properties of the targeted output or process is often imprecise and incomplete
[11,45]. Nevertheless, it is desirable to have some initial estimates of the
outcome before performing any extensive laboratory or field work. The approaches
employed in this research rely on the data alone to determine the structure and
parameters of the models. Thus, the derived models can be particularly valuable in
the preliminary design stages. For more reliability, it is suggested that the
results of the analyses be treated as a complement to conventional computing
techniques. In any case, the importance of engineering judgment in the
interpretation of the obtained results should not be underestimated. In order to
develop a sophisticated prediction tool, TGP, LGP, MLP, and RBF can be combined
with advanced deterministic models. Even if the deterministic model captures the
key physical mechanisms, it needs appropriate initial conditions and carefully
calibrated parameters to make accurate predictions. One idea is to calibrate the
required parameters using TGP, LGP, MLP, and RBF, which take into account historical
data sets as well as laboratory or field test results. This allows the uncertainties
related to in-situ conditions, which the deterministic model does not explicitly
account for, to be integrated. TGP and LGP provide a structured representation of
the constitutive material model that can readily be incorporated into finite
element or finite difference analyses. In this case, it is possible to use a
suitably trained GP-based material model instead of a conventional (analytical)
constitutive model in a numerical analysis tool such as a finite element code or
finite difference software (like FLAC) [11]. It is notable that the numerical
implementation of ANNs in finite element analyses has already been presented by
several researchers (e.g. [46]). This strategy has led to some qualitative
improvement in the application of the finite element method in engineering
practice [13].

9.6 Conclusions 

In this study, the TGP, LGP, MLP, and RBF paradigms were utilized to assess dif- 
ferent characteristics of concretes. The following conclusions can be derived from 
the results presented in this research: 

1. TGP, LGP, MLP, and RBF are effectively capable of predicting the compressive
strength and elastic modulus of concrete. The validity of the derived models
was tested on a part of the test results beyond the training data domain. In all
cases, LGP gives more accurate predictions than TGP. RBF and MLP generally
provide better results than TGP and LGP.

2. The proposed TGP, LGP, MLP, and RBF models simultaneously take into account
the role of several important factors representing the concrete behavior.
The proposed TGP and LGP simplified formulations can reliably be employed
for pre-design purposes or may also be used as quick checks on solutions
developed by more time-consuming and in-depth deterministic analyses.

3. Utilizing the models derived via the GP and ANN methods, the concrete com- 
pressive strength can easily be estimated from the design mixture basic proper- 
ties and subsequently the elastic modulus can be assessed using the compres- 
sive strength. Thus, there is no need to go through sophisticated and time- 
consuming laboratory tests. 

4. A substantial distinction between the GP and ANN methods and the statistical
techniques lies in their powerful ability to model the complex behavior of concrete
without any need for pre-defined equations or simplifications.

5. Although ANNs are successful in prediction, they usually do not give an explicit
function to calculate the outcome using the input values. Furthermore, they
require the structure of the neural network (e.g. number of inputs, transfer
functions, number of hidden layers, etc.) to be identified a priori. On the other
hand, the GP-based techniques provide greatly simplified prediction equations.

6. The constitutive models derived using TGP, LGP, MLP, and RBF are fundamentally
different from the conventional constitutive models based on first principles
(e.g., elasticity and plasticity theories). One of the distinctive features of GP
and ANN-based constitutive models is that they are based on experimental
data rather than on the assumptions made in developing the conventional models
[11]. Consequently, as more data become available, these material models
can be improved by re-training the TGP, LGP, MLP, and RBF algorithms.

Chapter 10

A New Approach to Network Optimization 
Using Chaos-Genetic Algorithm 

Golnar Gharooni-fard and Fahime Moein-darbari 



Abstract. Genetic Algorithms (GAs) have been widely used to solve network
optimization problems with varying degrees of success. Part of the problem with
GAs lies in premature convergence when dealing with large-scale and complex
problems: caught in local optima, the algorithm might fail to reach the global
optimum even after a large number of iterations. In order to overcome the problems
with traditional GAs, a method is proposed to integrate Chaos Optimization
Algorithms (COAs) with the GA to fully exploit their respective searching
advantages. The basic idea of COA is to transform the problem variables, by way of
a map, from the solution space to a chaos space and to perform a search that
benefits from the randomness, orderliness and ergodicity of the chaos variable. In
this chapter, we will first discuss network optimization in general, and then focus
on how chaos theory can be incorporated into the GA in order to enhance its
optimization capacities. We will also examine the efficiency of the proposed
Chaos-Genetic algorithm in the context of two different types of network
optimization problems: Grid scheduling and the Network-on-Chip mapping problem.

Keywords: network optimization, Genetic Algorithm, Chaos theory, Grid scheduling,
Network-on-Chip mapping problem.

10.1 Introduction 

Network theory basically deals with problems that have a graph structure. Graphs
are mathematical structures used to model pairwise relations between objects.
They consist of points, and lines connecting pairs of points. The points are called
nodes or vertices and the lines are called arcs. The arcs may have a direction on
them, in which case they are called directed arcs. If an arc has no direction, it is
often called an edge. If all the arcs in a graph are directed, the graph is said to be
directed (a digraph). Graphs are among the most ubiquitous models of both natural
and human-made structures. They can be used to model many types of relations
and process dynamics in physical, biological and social systems. Many problems
of practical interest can be represented by graphs [1]. In computer science, graphs
are used to represent networks of communication, data organization, computational
devices, the flow of computation, etc. Fig. 10.1 is an example of a network
modeled with a graph. At any given time, a message may take a certain amount of
time to traverse each line (due to congestion effects, switching delays, etc.). This
time can vary greatly, and telecommunication companies dedicate a significant
amount of their resources to tracking these delays. Assuming a centralized
switcher knows these delays, there remains the problem of routing a call so as to
minimize the total delay. This is an example of a particular type of network model,
called the shortest path problem, which involves a network with weighted edges and
two special nodes: a source and a destination. The goal is to find a path from the
source to the destination with the minimum total weight.
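
As a concrete illustration of the shortest path problem, the following Python sketch uses Dijkstra's algorithm, a standard exact method for graphs with non-negative edge weights. The algorithm choice, the function name `shortest_path`, and the small network with its delays are illustrative assumptions; they are not taken from Fig. 10.1.

```python
import heapq


def shortest_path(graph, source, destination):
    """Dijkstra's algorithm on a dict-of-dicts graph with non-negative weights.

    Returns (total_weight, path); (inf, []) if the destination is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry
        visited.add(node)
        if node == destination:
            break
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if destination not in dist:
        return float("inf"), []
    # Walk the predecessor links back from the destination to the source.
    path = [destination]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[destination], path


# Hypothetical network: each weight is the delay on a directed link.
network = {
    "S": {"A": 4, "B": 1},
    "A": {"D": 1},
    "B": {"A": 2, "C": 5},
    "C": {"D": 3},
    "D": {},
}
delay, route = shortest_path(network, "S", "D")
print(delay, route)
```

For large or more heavily constrained network problems, exact methods of this kind may become impractical, which is one motivation for the metaheuristics discussed in the remainder of this section.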



Fig. 10.1 A phone network modeled by a graph



Network problems that involve finding the least-cost solution, where each solution
is associated with a numerical cost, are generally studied under combinatorial
optimization, which concerns the efficient allocation of limited resources to meet
desired objectives when the values of some or all of the variables are restricted
to be integral [2]. In most such problems, there are many possible alternatives to
consider, and one overall goal determines which of these alternatives is best.

Different approaches have been used to solve network optimization problems
[3], among which is a large family of algorithms collectively labeled
metaheuristics. A metaheuristic designates a computational method that optimizes a
problem by iteratively trying to improve a candidate solution with regard to a
given measure of quality. Metaheuristics make few or no assumptions about the
problem being optimized and can search very large spaces of candidate solutions
[4], [5]. Metaheuristics can be used for combinatorial optimization, where an
optimal solution is sought over a discrete search space. Popular metaheuristics for
combinatorial problems include Simulated Annealing (SA) [6], Genetic Algorithm
(GA) [7], Particle Swarm Optimization (PSO) [8], Ant Colony Optimization
(ACO) [9] and Tabu Search (TS) [10]. Since our focus here is on GAs, in the next
section we will discuss them as one of the most popular metaheuristics used for
optimization purposes. The interested reader is referred to [11] and [12] for more
general surveys on metaheuristics.

10.2 Genetic Algorithms 

Genetic algorithms are inspired by the evolutionary theory of the origin of species,
which explains how weak and unfit species in nature face extinction by way of
natural selection. Natural selection is the process by which traits become more or
less common in a population due to consistent effects upon the survival or
reproduction of their bearers. In the long run, the strong species, which carry the
correct combination in their genes, become dominant in their population.
Sometimes, during the slow process of evolution, random changes may occur in the
genes. If these changes provide additional advantages in the challenge for
survival, new species evolve from the old ones. Unsuccessful changes are eliminated
by natural selection.

The concept of Genetic Algorithms (GAs) was introduced by John Holland in 
the early seventies as a special technique for function optimization [7]. In GA ter- 
minology, a solution vector is called an individual or a chromosome. Chromo- 
somes are made of discrete units called genes. Each gene controls one or more 
features of the chromosome. In the original implementation of GA by Holland, 
genes are assumed to be binary numbers. In later implementations, more varied 
gene types have been introduced. Normally, a chromosome corresponds to a 
unique solution in the solution space. The GA operates with a collection of chro- 
mosomes, called a population. The population is normally randomly initialized. 
As the search goes on, populations evolve to include fitter and fitter solutions, and
eventually converge to a single solution.

The basic idea of a GA is that the genetic pool of a given population potentially
contains the best solution to a given adaptive problem, although this solution
might not have been realized yet. The algorithm operates in an iterative manner
and evolves a new generation from the current generation by applying genetic
operators [13]. Given a clearly defined problem to be solved and strings of
candidate solutions, a simple GA works as follows:

1. Initialize the population.

2. Calculate the fitness value for each individual in the population. 

3. Reproduce selected individuals to form a new population. 

4. Perform crossover and mutation on the population. 

5. Loop to step 2 until some termination condition is met. 
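
The five steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the binary encoding follows Holland's original formulation, while the binary tournament selection, one-point crossover, bit-flip mutation, parameter values and the function name `run_ga` are assumptions made for the example. The fitness function used in the usage lines (the number of ones in the chromosome) is likewise only a toy objective.

```python
import random


def run_ga(fitness, n_genes, pop_size=30, generations=50,
           crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Simple binary GA that maximizes `fitness`; n_genes must be at least 2."""
    rng = random.Random(seed)
    # Step 1: initialize the population with random binary chromosomes.
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):  # Step 5: fixed number of generations
        # Step 2: calculate the fitness of each individual.
        scores = [fitness(ind) for ind in population]

        # Step 3: reproduce selected individuals (binary tournament).
        def select():
            i, j = rng.randrange(pop_size), rng.randrange(pop_size)
            return population[i] if scores[i] >= scores[j] else population[j]

        parents = [select() for _ in range(pop_size)]

        # Step 4: one-point crossover followed by bit-flip mutation.
        offspring = []
        for k in range(0, pop_size, 2):
            a, b = parents[k][:], parents[(k + 1) % pop_size][:]
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_genes)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            offspring.extend([a, b])
        population = [[1 - g if rng.random() < mutation_rate else g for g in ind]
                      for ind in offspring[:pop_size]]
        # Remember the best individual seen so far.
        best = max(population + [best], key=fitness)
    return best


# Toy usage: maximize the number of ones in a 20-gene chromosome.
best = run_ga(sum, n_genes=20)
print(best, sum(best))
```

Because the best individual found so far is stored separately, the returned solution can never be worse than the best member of the initial population, even though the population itself is fully replaced in every generation.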

In some GA implementations, operations other than crossover and mutation are
carried out in step 4. Crossover is considered by many to be an essential operation
of all GAs. It plays an important role in distributing the individuals over the space
of interest. Termination of the algorithm is usually based either on achieving a
population member with some specified fitness or on running the algorithm for a
given number of generations. Like many other metaheuristics, GAs do not guarantee
that an optimal solution is ever found. They often show a very fast initial
convergence followed by progressively slower improvement. Therefore, different
techniques have been used to improve the results obtained from GAs [14]. After
introducing chaos theory in the next section, we will explain how to integrate this
concept with the GA in order to enhance the quality of the solutions.



10.3 Chaos Theory 

Chaos theory is the study of the behavior of dynamical systems that are highly
sensitive to initial conditions. In common usage, "chaos" means "a state of
disorder", but the adjective "chaotic" is defined more precisely in chaos theory.
Although there is no universally accepted mathematical definition of chaos, a
commonly used definition describes chaos as non-periodic, long-term behavior
in a deterministic system that exhibits sensitive dependence on initial conditions
[15]. Non-periodic long-term behavior means that the system's trajectory in phase
space does not settle down to any fixed points or periodic orbits as time tends to
infinity. Deterministic systems have no random (or probabilistic) parameters.
It is a common misconception that chaotic systems are noisy systems driven by
random processes. The irregular behavior of chaotic systems arises from intrinsic
nonlinearities rather than noise. Sensitive dependence on initial conditions, the
proverbial "butterfly effect", requires that trajectories originating from nearly
identical initial conditions diverge exponentially. Despite what the name suggests,
chaos is not the absence of order; it is a subtle state that is poised between order
and randomness, with both aspects intermingled.

If a chaotic system's behavior is plotted over an extended period, obscure
patterns might emerge. When a bounded chaotic system does have some long-term
pattern, but not a simple periodic oscillation or orbit, it is said to have a
strange attractor [16]. In other words, the strange attractor is the natural shape
of chaos. It is called strange because of its complex geometry, and it is an
attractor because the system that it describes is always drawn to the behavior that
it represents, as if attracted to it. The mathematical model known as the "Lorenz
system"¹ has been used as a paradigm for chaotic systems that satisfy the
above definition. The Lorenz system consists of three first-order coupled
differential equations as follows

$\frac{dx}{dt} = \sigma (y - x)$

$\frac{dy}{dt} = x(\rho - z) - y$  (10.1)

$\frac{dz}{dt} = xy - \beta z$

where $\sigma, \rho, \beta > 0$; usually $\sigma = 10$ and $\beta = 8/3$, while
$\rho$ is varied. The system exhibits chaotic behavior when $\rho = 28$ [15]. The
Lorenz system has three dynamic variables, and consequently the state-space picture
of such a system is three-dimensional. Plotting the trajectory of the Lorenz system
in state space, shown in Fig. 10.2, reveals what was earlier defined as a strange
attractor (the Lorenz chaotic attractor). The plot shows how the state of a
dynamical system (the three variables of a three-dimensional system) evolves over
time in a complex, non-repeating pattern.

¹ The "Lorenz system" is named after the American meteorologist Edward N. Lorenz, who
in 1963 discovered chaotic behavior in a computer study of weather.




Fig. 10.2 The Lorenz attractor 
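
Sensitive dependence on initial conditions can be checked numerically with a short Python sketch. It integrates the Lorenz system for the commonly quoted parameter values 10, 28 and 8/3 (named `sigma`, `rho` and `beta` in the code) and compares two trajectories whose starting points differ by only 1e-8. The fixed-step fourth-order Runge-Kutta integrator, the step size, the starting point and the function names are assumptions chosen for illustration.

```python
import math


def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations for state = (x, y, z)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step of size dt."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, dt=0.01, steps=4000):
    """Return the list of states visited, including the initial one."""
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(lorenz, state, dt)
        trajectory.append(state)
    return trajectory


# Two trajectories whose initial x values differ by 1e-8.
traj_a = integrate((1.0, 1.0, 1.0))
traj_b = integrate((1.0 + 1e-8, 1.0, 1.0))
separation = [math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
              for a, b in zip(traj_a, traj_b)]
print(f"initial separation: {separation[0]:.1e}")
print(f"largest separation: {max(separation):.2f}")
```

Both trajectories remain bounded on the attractor, yet the distance between them grows by many orders of magnitude, which is the behavior described above.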

One-dimensional noninvertible maps are the simplest systems capable of gene- 
rating chaotic motion. As such, they serve as a convenient starting point for the 
study of chaos [17]. Here, we introduce some well known one-dimensional maps. 

Logistic Map. The logistic map proposed by Robert May is a polynomial map and
is often cited as an example of how complex behavior can arise from a very simple
nonlinear dynamical equation [15]. This map is defined as

$x_{n+1} = f(\mu, x_n) = \mu x_n (1 - x_n), \quad 0 < \mu \le 4$  (10.2)

where $\mu$ is a control parameter and x is a variable. Since the equation
represents a deterministic dynamic system, it might seem that its long-term
behavior can be predicted, but that is not the case, since its behavior depends
heavily on the value of $\mu$. The value of the control parameter determines
whether x converges to a constant point, oscillates between two or more values, or
behaves chaotically in an unpredictable pattern [18].
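
The three regimes can be observed directly by iterating the map. The Python sketch below discards a transient and prints the next few iterates for three illustrative values of the control parameter; the specific values 2.8, 3.2 and 4.0, the starting point 0.2 and the function name `logistic_orbit` are assumptions chosen for the example.

```python
def logistic_orbit(mu, x0=0.2, n_transient=1000, n_keep=8):
    """Iterate x -> mu * x * (1 - x) and return n_keep values after a transient."""
    x = x0
    for _ in range(n_transient):
        x = mu * x * (1.0 - x)
    orbit = []
    for _ in range(n_keep):
        x = mu * x * (1.0 - x)
        orbit.append(x)
    return orbit


fixed_point = logistic_orbit(2.8)  # settles on the constant value 1 - 1/mu
period_two = logistic_orbit(3.2)   # alternates between two values
chaotic = logistic_orbit(4.0)      # irregular values in [0, 1]

for name, orbit in [("mu = 2.8", fixed_point),
                    ("mu = 3.2", period_two),
                    ("mu = 4.0", chaotic)]:
    print(name, [round(v, 4) for v in orbit])
```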



Tent Map. In mathematics, the tent map is an iterated function, in the shape of a
tent, forming a discrete-time dynamical system. It takes a point $x_n$ on the real
line and maps it to another point as

$x_{n+1} = \begin{cases} \mu x_n, & x_n < 1/2 \\ \mu (1 - x_n), & x_n \ge 1/2 \end{cases}$  (10.3)

where $\mu$ is a positive real constant [19]. The tent map with $\mu = 2$ and the
logistic map with $\mu = 4$ are topologically conjugate, and thus their behavior
under iteration is identical in this sense. Depending on the value of $\mu$, the
tent map demonstrates a range of dynamical behavior from predictable to chaotic.

Bernoulli Shift Map. The Bernoulli shift map belongs to a class of piecewise
linear maps, which consist of a number of piecewise linear segments. This map is
particularly simple, consisting of two linear segments to model the active and
passive states of the source [20]. It is defined as follows

$x_{n+1} = \begin{cases} \dfrac{x_n}{1 - \lambda}, & 0 < x_n \le (1 - \lambda) \\ \dfrac{x_n - (1 - \lambda)}{\lambda}, & (1 - \lambda) < x_n < 1 \end{cases}$  (10.4)






Sine Map. The sine map is described by the following equation

$x_{n+1} = \frac{a}{4} \sin(\pi x_n)$  (10.5)

where $0 < a \le 4$. Qualitatively, this map has the same shape as the logistic map.

ICMIC Map. The iterative chaotic map with infinite collapses (ICMIC) has
infinitely many fixed points, in contrast to one-dimensional maps with finite
collapses [21], [22]. The ICMIC map is described by the following equation

$x_{n+1} = \sin\left(\frac{a}{x_n}\right)$  (10.6)

where $a \in (0, \infty)$ is an adjustable parameter.

10.4 Chaos Optimization Algorithm (COA) 

Among random-based optimization algorithms, the methods that use chaotic variables
instead of random variables are called Chaos Optimization Algorithms (COAs) [23],
[24]. Originally proposed by Li and Jiang, COA searches the solution space based
on the regularity of chaotic variables and escapes local minima more easily than
stochastic optimization algorithms [25]. By means of the ergodicity, regularity and
semi-stochastic properties of chaos, the solution migrates in a chaotic way among
the local minima and finally converges to the global optimal solution [26].
Experimental studies assert that the benefits of using chaotic signals instead of
random signals are often evident, although this has not been proved mathematically
yet [27]. The COA procedure is as follows:

1. Set k = 0, take a random solution in the problem domain as the incumbent
$x^*$ with value $f(x^*)$, and choose chaotic variables $0 < r_i^0 < 1$
(i = 1, 2, ..., n).

2. Map the chaotic sequences $r_i^k$ to X according to the characteristics of the
particular problem.

3. Compare the function value f(X) with $f(x^*)$; if f(X) is better, replace
$f(x^*)$ with f(X) and $x^*$ with X.

4. Apply one of the aforementioned chaotic equations (denoted by M):

$r_i^{k+1} = M(r_i^k)$  (10.7)

Note that the chaotic sequences lie in the interval between 0 and 1.

5. Set k = k + 1 and loop back to step 2 until the termination condition is
reached.

Numerical results show that COA takes fewer iterations to reach an optimum solution
than most global optimization methods [25]. However, COA has the deficiency of
taking much time to get to the optimum value, which affects the speed of
convergence [28]. To overcome this limitation, an improved chaos optimization
method that combines COA and GA is presented in the next section.
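
The COA procedure listed above can be sketched in a few lines of Python. The sketch assumes a minimization problem with box constraints, uses the logistic map with control parameter 4 as the chaotic equation M, and maps each chaotic variable linearly onto its variable range; the function name `chaos_optimize`, the initial values of the chaotic variables, the fixed iteration budget and the test function are all illustrative assumptions.

```python
def logistic(r):
    """Logistic map with control parameter 4, used here as the chaotic map M."""
    return 4.0 * r * (1.0 - r)


def chaos_optimize(f, bounds, iterations=5000, chaotic_map=logistic):
    """Minimize f over the box `bounds` = [(low, high), ...] by chaotic search."""
    n = len(bounds)
    # Step 1: k = 0, distinct chaotic variables in (0, 1), no incumbent yet.
    r = [(0.1234 + 0.0617 * i) % 1.0 for i in range(n)]
    best_x, best_f = None, float("inf")
    for _ in range(iterations):  # Step 5: fixed iteration budget
        # Step 2: map the chaotic variables onto the problem domain.
        x = [low + ri * (high - low) for ri, (low, high) in zip(r, bounds)]
        # Step 3: keep the candidate if it improves on the incumbent.
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
        # Step 4: advance every chaotic variable with the chaotic map.
        r = [chaotic_map(ri) for ri in r]
    return best_x, best_f


# Hypothetical test problem: a shifted sphere function with minimum at (1, -2).
def sphere(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


box = [(-5.0, 5.0), (-5.0, 5.0)]
best_x, best_f = chaos_optimize(sphere, box)
print(best_x, best_f)
```

The search samples the whole domain without using information from earlier candidates, which illustrates why plain COA may need many function evaluations to refine a solution.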

10.4.1 Chaos-Genetic Algorithm (CGA) 

The idea of using chaotic systems instead of random processes has recently
attracted attention in several fields, including optimization theory. The basic
idea is to transform the variables of a problem from the solution space to chaos
space and then perform a search to find a solution by virtue of the randomness,
orderliness and ergodicity of the chaos variable. Although the COA has many
advantages, it makes no use of the experiential information previously acquired
[29]. GAs, on the other hand, have no guaranteed convergence even to a local
minimum [30]. The genes from a few highly fit (but not optimal) individuals may
rapidly come to dominate the population, causing it to converge on local minima,
and once the population has converged, the ability of the GA to continue to search
for better solutions is largely compromised.

In order to overcome the shortcomings of both COA and GA, one option is to
integrate the two in order to bring together the searching advantages of both
algorithms. The concept of Chaos-Genetic Algorithms (CGA), first introduced in [30],
has the following characteristics. Firstly, CGA benefits from the characteristics
of the chaotic variables to distribute the individuals of subgenerations
ergodically in the defined space and thus to avoid premature convergence in the
subgenerations. Secondly, owing to its evolutionary nature, CGA maintains the
fittest individuals in each run and hence increases the probability of finding the
global optimal solution. CGA can be implemented by simply adding a chaotic
mapping operator to the standard GA operators, namely crossover and mutation.

As an example of a chaotic equation, the logistic map has been extensively
analyzed in the past decade. The evolution of the chaotic variables can be defined
through the following equation [31]:

$r_i^{k+1} = 4 r_i^k (1 - r_i^k), \quad i = 1, 2, \ldots, n.$  (10.8)

In principle, this is the same as the equation introduced for the logistic map in
Section 10.3. The value of the parameter $\mu = 4$ is chosen so that the system
behaves chaotically. Here $r_i$ is the i-th chaotic variable and k denotes the
iteration number. The value of $r_i$ is distributed in the range [0, 1], and n
denotes the number of genes in each chromosome. In order to perform the chaotic
mapping, the following procedure is proposed.

1. Divide the interval [0, 1] into n equal sub-intervals, of which the lower
limits $[a_1, a_2, \ldots, a_n]$ are represented by vector a, and the upper limits
$[b_1, b_2, \ldots, b_n]$ by vector b.

2. The real value of each $x_i^{(1)}$ in the first randomly produced population is
linearly mapped to a new value $0 < r_i^{(1)} < 1$, using

$r_i^{(1)} = \frac{1}{b_i - a_i} (x_i^{(1)} - a_i).$  (10.9)

3. The chaotic variables of the next iteration, $r_i^{(2)}$, are produced by
applying the logistic map equation to the $r_i^{(1)}$ values generated in the
previous step.

4. The chaotic variables $r_i^{(2)}$ are then used to produce $x_i^{(2)}$, using

$x_i^{(2)} = a_i + r_i^{(2)} (b_i - a_i), \quad i = 1, 2, \ldots, n.$  (10.10)

We can repeat the process in order to produce the next values of $x_i$. Although
chaos variables are usually generated by the logistic map, there is no reason not to
try any of the previously defined one-dimensional maps in order to form a chaotic
mapping operator. Fig. 10.3 shows a flowchart of the overall process of the
Chaos-Genetic Algorithm using the logistic map as a chaotic mapping operator to
produce the chaotic population P2 from the randomly produced initial population
P1. In the next section we will examine the performance of CGA in two types of
network optimization problems, namely Grid scheduling and the Network-on-Chip
mapping problem.

Set t = 1 and determine t_m, the number of generations
Generate S random individuals -> P1
Produce S chaotic individuals, using the chaotic mapping operator -> P2
Calculate the fitness of each individual in P1 and P2
Select S fittest individuals among P1 and P2 -> P3
Perform genetic operators such as crossover, mutation, etc. to form a new population -> P4
t = t + 1, P1 = P4; repeat from the chaotic mapping step until t_m generations are completed
Return the fittest individual as a solution
End

Fig. 10.3 Chaos-Genetic Algorithm procedure
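
The chaotic mapping operator and the selection step of Fig. 10.3 can be sketched in Python as follows. The sketch reads Eq. (10.9) as the linear map from each gene to the unit interval, which is the inverse of Eq. (10.10); the function names, the population size and the toy fitness function are assumptions made for illustration, and crossover and mutation are omitted.

```python
import random


def logistic(r):
    """Logistic map with control parameter 4, as in Eq. (10.8)."""
    return 4.0 * r * (1.0 - r)


def chaotic_individual(x, a, b):
    """Apply the chaotic mapping operator to one chromosome x."""
    r = [(xi - ai) / (bi - ai) for xi, ai, bi in zip(x, a, b)]       # Eq. (10.9)
    r_next = [logistic(ri) for ri in r]                              # Eq. (10.8)
    return [ai + ri * (bi - ai) for ri, ai, bi in zip(r_next, a, b)]  # Eq. (10.10)


def chaotic_population(p1, a, b):
    """Produce the chaotic population P2 from the population P1."""
    return [chaotic_individual(x, a, b) for x in p1]


def select_fittest(p1, p2, fitness, size):
    """Keep the `size` fittest individuals among P1 and P2 (population P3)."""
    return sorted(p1 + p2, key=fitness, reverse=True)[:size]


# [0, 1] divided into n = 4 equal sub-intervals with lower limits a and upper limits b.
n = 4
a = [i / n for i in range(n)]
b = [(i + 1) / n for i in range(n)]

rng = random.Random(1)
p1 = [[rng.uniform(ai, bi) for ai, bi in zip(a, b)] for _ in range(6)]
p2 = chaotic_population(p1, a, b)


# Toy fitness: prefer genes close to the middle of their sub-interval.
def toy_fitness(x):
    return -sum((xi - (ai + bi) / 2.0) ** 2 for xi, ai, bi in zip(x, a, b))


p3 = select_fittest(p1, p2, toy_fitness, size=6)
print(p3[0])
```

In a full CGA run, crossover and mutation would then be applied to P3 to form P4, and the loop would continue with P4 in place of P1.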



10.5 Grid Scheduling: Case Study # 1 



A grid is a hardware and software infrastructure that provides dependable,
consistent, pervasive, and inexpensive access to high-end computational
capabilities [30]. It is a shared environment, implemented via the deployment of a
persistent, standards-based service infrastructure that supports the creation and
sharing of resources within distributed communities. The resources might be
computers, storage space, instruments, software applications, and data, all
connected through the Internet and a middleware software layer that provides basic
services for security, monitoring, resource management, etc. Resources owned by
various administrative organizations are shared under locally defined policies that
specify what is shared, who is allowed to access what, and under what conditions
[32].

From the point of view of scheduling systems, a higher level abstraction for the 
Grid can be applied by ignoring some infrastructure components such as authenti- 
cation, authorization, resource discovery and access control. Thus, the following 
definition for the term Grid is adopted in our study: "A type of parallel and distri- 
buted system that enables the sharing, selection, and aggregation of geographically 
distributed autonomous and heterogeneous resources dynamically at runtime de- 
pending on their availability, capability, performance, cost, and users' quality-of- 
service requirements" [33].

To facilitate the discussion on grid scheduling, we need to define some
frequently used terms. Tasks are atomic units to be scheduled by the scheduler and
assigned to resources. The properties of a task are parameters such as CPU/memory
requirement, deadline, priority, etc. A job (metatask or application) is a set of
atomic tasks that will be carried out on a set of resources. Resources are required
to carry out an operation, for example a processor for data processing, a data
storage device, or a network link for data transport. A site (or node) is an
autonomous entity composed of one or multiple resources.

Based on the definitions above, task scheduling can be defined as the mapping
of tasks to a selected group of resources which may be distributed across multiple
administrative domains. Although a grid is a highly diverse system, made up of
various applications, middleware components, and resources, we can still find a
logical architecture for its task scheduling subsystem which, as noted by Schopf
in [34], can be generalized into three stages:

1. Resource discovering and filtering,

2. Resource selecting and scheduling according to certain objectives, 

3. Job submission. 

Since the study of scheduling algorithms is our primary concern, we mainly focus
on the second stage. Scheduling of interdependent tasks in distributed
heterogeneous computing environments is well known to be an NP-hard problem [35].
Several heuristic algorithms have been applied to solve the scheduling problem. These
can be classified into two major groups according to their main objectives. The
first group only attempts to minimize workflow execution time, without
considering the user's budget. Min-Min, which gives the highest priority to tasks with
the shortest execution time, and Max-Min, which gives the highest priority to tasks
with the longest execution time, are two major heuristic algorithms employed for
scheduling workflows on grids [36]. Sufferage is another heuristic algorithm,
which gives high scheduling priority to tasks whose completion time on the
second-best resource is far from that on the best resource [36]. Another workflow
scheduling algorithm, developed by the authors of [37], is based on a Greedy Randomized
Adaptive Search Procedure (GRASP). Another workflow-level heuristic is the
Heterogeneous-Earliest-Finish-Time (HEFT) algorithm proposed by Wieczorek et al.
[38]. The second group addresses scheduling problems based on the user's budget
constraints. Nimrod-G [39] schedules independent tasks for parameter-sweep
applications to meet the user's budget. More recently, the LOSS and GAIN
scheduling approaches were developed to adjust a schedule generated by a
time-optimized or cost-optimized heuristic so that it meets the user's budget
constraints [40].
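To make the Min-Min idea concrete, the following is a minimal sketch for independent tasks only; it ignores workflow dependencies, budgets and data transfer, which the cited works do handle. The input layout `exec_time[t][r]` (estimated run time of task t on resource r) and the function name are assumptions made for illustration.

```python
def min_min(exec_time):
    """Min-Min for independent tasks.

    exec_time[t][r] is the estimated run time of task t on resource r.
    Returns (assignment, makespan), where assignment maps task -> resource.
    """
    n_res = len(exec_time[0])
    ready = [0.0] * n_res                 # time at which each resource is free
    unscheduled = set(range(len(exec_time)))
    assignment = {}
    while unscheduled:
        best = None
        for t in unscheduled:
            # Resource giving task t its earliest completion time
            r = min(range(n_res), key=lambda k: ready[k] + exec_time[t][k])
            ct = ready[r] + exec_time[t][r]
            # Keep the task whose earliest completion time is smallest overall
            if best is None or ct < best[0]:
                best = (ct, t, r)
        ct, t, r = best
        assignment[t] = r
        ready[r] = ct
        unscheduled.remove(t)
    return assignment, max(ready)
```

Max-Min differs only in the selection step: among the per-task earliest completion times it picks the largest instead of the smallest.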



10.5.1 Challenges of Scheduling Algorithms in Grid Computing 

Although previous research in this area is of great value, traditional scheduling
models generally produce poor grid schedules in practice [32]. To see why, let
us go through the assumptions underlying traditional systems:

• All resources reside within a single administrative domain.

• To provide a single system image, the scheduler controls all of the re- 
sources. 

• The resource pool is invariant. 

• Contention caused by incoming applications can be managed by the 
scheduler according to some policies, so that its impact on the perfor- 
mance that the site can provide to each application can be well predicted. 

• Computations and their data reside in the same site. 

Unfortunately, not all of these assumptions hold in grid circumstances. There are 
unique characteristics in grid computing, listed by the authors of [41], which make 
the design of scheduling algorithms more challenging: 

• Heterogeneity and Autonomy. In grid computing, because resources are distri- 
buted in multiple domains on the Internet, heterogeneity is a characteristic not 
only of computational and storage nodes but also of the underlying networks con- 
necting them. This results in different capabilities for job processing and data 
access. Autonomy also gives rise to a diverse array of local resource management
techniques and access control policies, such as priority settings for different
applications and resource reservation methods. Thus, a grid scheduler is
required to be adaptive to different local policies. The heterogeneity and 
autonomy on the grid user side are represented by various parameters, including 
application types, resource requirements, performance models, and optimization 
objectives. 

• Performance Dynamism. Producing a feasible schedule usually depends on
estimates of the performance that candidate resources can provide, especially when the
algorithms are static. Grid schedulers work in a dynamic environment where per- 
formance of available resources is constantly changing. The change comes from 
site autonomy and competition for resources by various applications. 

• Resource Selection and Computation-Data Separation. In traditional systems,
the executable code of applications and the input/output data are usually at the
same site, or the input sources and output destinations are determined before the
application is submitted. Thus the cost of data staging can either be neglected or is
a constant determined before execution, and scheduling algorithms need not
consider it. But in a grid, which consists of a large number of heterogeneous
computing sites (from supercomputers to desktops) and storage sites connected via
wide area networks, the computation sites of an application are usually selected by
the grid scheduler according to resource status and certain performance models.
Additionally, in a grid, the communication bandwidth of the underlying network
is limited and shared by a host of background loads, so the inter-domain
communication cost cannot be neglected. Many grid applications are data intensive,
so the data staging cost is considerable. This situation brings about the
computation-data separation problem: the advantage gained by selecting a
computational resource with low computational cost may be neutralized by its high
access cost to the storage site.

These challenges reflect the unique characteristics of grid computing and pose
significant obstacles to the design and implementation of efficient and effective
grid scheduling systems. It is believed, however, that research achievements on
traditional scheduling problems can still provide stepping-stones for a new
generation of scheduling systems.

In order to introduce the Chaos-genetic algorithm to solve the workflow sche- 
duling problem, we need to define an appropriate problem representation, fitness 
assignment, and genetic operators. These will be discussed in the following sub- 
sections. 

10.5.2 Problem Description 

As mentioned in the previous section, the scheduling problem becomes more
challenging because of some unique characteristics of grid computing. The grid
scheduling problem can be defined as follows. A workflow application can be modeled
as a Directed Acyclic Graph (DAG). There is a finite set of tasks $T_i$
($i = 1, 2, \ldots, n$) and a set of directed arcs of the form $(T_i, T_j)$, where
$T_i$ is the parent task of $T_j$ and $T_j$ is the child of $T_i$. A child task can
never be executed unless all of its parent tasks have been completed. Let $B$ be the
cost constraint (budget) and $D$ the time constraint (deadline) specified by the
user for the workflow execution. The total number of available services is denoted
by $m$. There is a set of services $S_j$ ($j = 1, 2, \ldots, m$) capable of
executing task $T_i$, but each task can only be assigned for execution to one of
these services. Services have varied processing capabilities delivered at different
prices. We denote by $t_i^j$ the processing time and by $c_i^j$ the service price
for processing $T_i$ on service $S_j$. The scheduling problem is to map every
$T_i$ onto a suitable $S_j$ in order to get the best trade-off between execution
time and cost in a workflow, considering the user's budget and deadline.
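As a small illustration of this formulation, the sketch below evaluates one candidate mapping of tasks to services on a DAG. It is a simplification under stated assumptions: data transmission is ignored, and two tasks mapped to the same service are allowed to overlap in time, so only the parent-child precedence is enforced. The names `parents`, `service_of`, `t` and `c` are hypothetical; `t[i][j]` and `c[i][j]` play the roles of the processing time and price of task i on service j.

```python
from functools import lru_cache

def evaluate_mapping(parents, service_of, t, c):
    """Return (completion_time, total_cost) of a task-to-service mapping.

    parents[i]    : list of parent tasks of task i (the DAG arcs)
    service_of[i] : index of the service chosen for task i
    t[i][j], c[i][j] : processing time and price of task i on service j
    """
    n = len(parents)

    @lru_cache(maxsize=None)
    def finish(i):
        # A task starts only after all of its parents have finished
        start = max((finish(p) for p in parents[i]), default=0.0)
        return start + t[i][service_of[i]]

    completion_time = max(finish(i) for i in range(n))
    total_cost = sum(c[i][service_of[i]] for i in range(n))
    return completion_time, total_cost
```

A mapping is then acceptable to the user when the returned completion time does not exceed D and the returned cost does not exceed B.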

10.5.3 The Chaos-Genetic Scheduling Algorithm (CGA) 

For a workflow scheduling problem, a feasible solution is required to meet several 
conditions: 

1. A task can only be started after all its predecessors have completed. 

2. Every task appears once and only once in the schedule. 

3. Each task must be allocated to one available time slot of a service capable
of executing the task.

Fig. 10.4 A sample workflow followed by a set of source-to-task assignments

Each individual in the population represents a feasible solution to the problem, 
and consists of a vector of task assignments. Each task assignment includes four 
elements (task ID, service ID, start time, end time) [42]. The first two parameters 
identify to which service each task is assigned. Since involving time frames during 
the genetic operation may lead to a very complicated situation [43], we ignore the 
time frames here. Therefore, the operation strings (chromosomes) encode only the 
service allocation for each task and the order of the tasks allocated to each service. 
Different execution priorities of such parallel tasks within the workflow may im- 
pact the performance of workflow execution significantly. For this reason, the so- 
lution representation strings are required to show the order of task assignments on 
each service in addition to service allocation of each task. As suggested by Buyya 
[43], we create an array to represent a schedule as illustrated in Fig. 10.4. Each 
element of this array represents a service and the indexes refer to the task number. 

As stated earlier, the problem is to schedule a workflow execution considering
both time and user budget constraints. The first decision to be made is how to
represent the solution, which was shown in Fig. 10.4. The population is initialized
randomly, using a random generator to produce values between 1 and n. For
each task, these random values are chosen from the sources that are capable of
executing that task. The length of the chromosome depends on the number of tasks in
the workflow. A chaotic mapping operator is then applied to the initial population,
generating a new chaotic population.

At this stage, the fitness of the individuals of the entire population is evaluated. 
The fitness value is often proportional to the output value of the function being op- 
timized according to the given objectives. As the goal of scheduling is to get the 
best trade-off between the time and cost of the workflow execution, the fitness 
function divides the evaluation into two parts [43]: cost-fitness and time-fitness.
For budget constrained scheduling, the cost-fitness component produces results
with less cost. The cost-fitness function of an individual $I$ is defined by

$$F_{cost}(I) = \frac{c(I)}{B} \qquad (10.11)$$

where $c(I)$ is the sum of the task execution cost and data transmission cost of
$I$ and $B$ is the budget of the workflow. For budget constrained scheduling, the
time-fitness component is designed to produce individuals that satisfy the deadline
constraint. The time-fitness function of an individual $I$ is defined by

$$F_{time}(I) = \frac{t(I)}{D} \qquad (10.12)$$

where $t(I)$ is the completion time of $I$ and $D$ is the deadline of the workflow.
The final fitness function combines the two parts and is expressed as

$$F(I) = \begin{cases} F_{cost}(I) + F_{time}(I), & \text{if } F_{cost}(I) > 1 \text{ or } F_{time}(I) > 1 \\ \dfrac{c(I)}{maxcost} \times \dfrac{t(I)}{maxtime}, & \text{otherwise} \end{cases} \qquad (10.13)$$

where maxcost is the cost of the most expensive solution in the current population
and maxtime denotes the largest completion time in the current population.
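The combined fitness can be written as a short function. The sketch below assumes that Eq. (10.11) and Eq. (10.12) are the ratios c(I)/B and t(I)/D, that the "otherwise" branch of Eq. (10.13) is the product of c(I)/maxcost and t(I)/maxtime, and that smaller fitness values are better. These readings and the argument names are assumptions for illustration.

```python
def fitness(cost, time, budget, deadline, max_cost, max_time):
    """Combined fitness of one individual (smaller is assumed to be better).

    cost, time         : c(I) and t(I) of the individual
    budget, deadline   : user constraints B and D
    max_cost, max_time : largest cost and completion time in the population
    """
    f_cost = cost / budget      # Eq. (10.11)
    f_time = time / deadline    # Eq. (10.12)
    if f_cost > 1 or f_time > 1:
        # At least one constraint is violated: the sum exceeds 1
        return f_cost + f_time
    # Both constraints hold: the product of two ratios is at most 1
    return (cost / max_cost) * (time / max_time)
```

Under these assumptions every individual that violates a constraint scores above 1, while every feasible individual scores at most 1, so feasible schedules always rank ahead of infeasible ones.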

Elitism is incorporated into the algorithm by transferring the single fittest indi- 
vidual directly to the next generation. Crossover is used to create new solutions by 
rearranging parts of the existing solutions in the current population. The idea be- 
hind the crossover operation is that a higher quality solution may result from the 
combination of two of the current fittest solutions [44]. We have implemented a 
two-point crossover which is illustrated in Fig. 10.5. For population based
algorithms, mutation occasionally occurs in order to allow a child to obtain features
that are not possessed by either of its parents. This process helps the algorithm
explore new and possibly better genetic material than has been previously
considered. The process of mutation is shown in Fig. 10.6.

Fig. 10.5 The Crossover operation: First, two random parents are chosen from the current
population. Then two random points are selected from the schedule order of both parents.
The locations of all tasks between the two points are exchanged. Two new offspring are
generated by combining task assignments taken from the two parents.

[Figure 10.6: a parent chromosome over tasks T1 to T7 and the child obtained after mutation]

Fig. 10.6 The Mutation operation: A task is randomly selected in a chromosome. An
alternative service which is also capable of executing the task is randomly selected to
replace the current task allocation.

The new population is now ready for another round of chaotic mapping,
crossover, and mutation, producing yet another generation. The initial population is
thus replaced by the newly generated individuals. More generations are produced until
the stopping condition (a maximum number of generations) is met. The fittest
chromosome is then returned as the solution.
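The operators described above can be sketched on a simplified chromosome. Here a chromosome is assumed to be a plain list in which position i holds the service chosen for task i, so the ordering of tasks on each service (which the chapter's encoding also carries) is left out. The table `capable[i]`, listing the services able to run task i, and all function names are hypothetical.

```python
import random

def random_individual(capable, rng):
    """Pick, for every task, one of the services able to execute it."""
    return [rng.choice(capable[i]) for i in range(len(capable))]

def two_point_crossover(a, b, rng):
    """Exchange the segment between two random cut points of the parents."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def mutate(ind, capable, rng):
    """Reassign one randomly chosen task to a different capable service."""
    child = ind[:]
    i = rng.randrange(len(child))
    alternatives = [s for s in capable[i] if s != child[i]]
    if alternatives:            # tasks with a single capable service are kept
        child[i] = rng.choice(alternatives)
    return child
```

Because both parents only ever hold capable services at each position, the children produced by this crossover and mutation do as well, so no repair step is needed for the service-capability condition.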



10.5.4 Experimental Results 

Given that different workflow applications may have different impact on the per- 
formance of the scheduling algorithms, we have evaluated algorithms on different 
workflow structures. According to many grid workflow projects [45], workflow 
applications can be categorized into balanced structures and unbalanced struc- 
tures. Fig. 10.7 shows balanced and unbalanced-structure applications used in our 
experiments. As shown in Fig. 10.7(a), the balanced-structure application consists 
of several parallel pipelines, which require the same types of services but process 
different data sets. As can be seen in Fig. 10.7(b), the structure of the unbalanced 
application is more complex. Unlike the balanced-structure application, many
parallel tasks in the unbalanced structure require different types of services, and their
workload and I/O data vary significantly.




Fig. 10.7 Workflow structures: (a) Balanced workflow (fMRI). (b) Unbalanced workflow 
(DNA) 



A Chaos-Genetic scheduling Algorithm (CGA) is introduced to solve the
workflow execution planning problem. Our goal is to simultaneously minimize
two conflicting objectives, execution time and execution price, while meeting the
users' maximum time constraint (deadline) and price constraint (budget). We have
simulated 15 types of services with various price levels. The parameter settings 
used as a default configuration for the algorithms are listed in Table 10.1. The be- 
haviors of algorithms are also observed at three constraint levels, namely relaxed 
constraint, medium constraint, and tight constraint. The relaxed constraint level 
assumes that users require relatively large deadline and budget, while the tight 
constraint level assumes that users require small deadline and budget. In other 
words, the relaxed/tight deadlines and budgets of an application are determined by 
the maximum/minimum time and cost for the workflow execution.

Table 10.1 Parameter settings for workflow scheduling problem

Parameter               Value/type
Population size         10
Initial population      randomly generated solution
Maximum Generation      100
Crossover Probability   0.98
Mutation Probability    0.05
Maximum Iteration       10



As illustrated in Fig. 10.8, neither GA nor CGA satisfies the low budget
constraint (about G$3500); however, CGA shows better results in both
applications. Results gradually improve under medium budget constraints. The
descending trend in the diagram shows that as the budget increases, it becomes
easier for the algorithms to meet the user budget constraints. On the other hand,
considering the differences between the two approaches, it is clear that GA takes
longer to complete even under relaxed constraints. Therefore, CGA shows better
performance compared to GA in both applications.



[Figure 10.8: two panels, "fMRI Execution Cost" and "DNA Execution Cost", comparing GA and CGA; horizontal axis: Budget (3500, 5500, 7500)]

Fig. 10.8 Comparison between the execution cost of GA and CGA on balanced (fMRI) and
unbalanced (DNA) workflows, under three constraint types: tight (G$3500), medium
(G$5500) and relaxed (G$7500). Each experiment was repeated 10 times and the average
values are used to report the results. The results are obtained under the assumption of
D = 220(H) for fMRI and D = 240(H) for DNA. The values on the vertical axes are the
total cost divided by the user budget constraint.



Fig. 10.9 illustrates a comparison between the execution times of the two
algorithms on the fMRI and DNA workflows. Here we vary the user deadline
from 190(H) to 290(H) for DNA and from 170(H) to 270(H) for fMRI, since the
latter is a balanced workflow and takes less time to complete. It can be seen that
GA takes longer to complete in most of the conditions. The differences are more
clearly observed in the unbalanced workflow structure.

Fig. 10.9 Comparison between the execution time of GA and CGA on balanced (fMRI) and
unbalanced (DNA) workflows, under three constraints: tight (around 180H), medium
(around 230H) and relaxed (around 280H) with a medium budget of G$5000. Each
experiment was repeated 10 times and the average values are used to report the results.



In all of the above diagrams, there are conditions where CGA and GA show
similar results (for instance in Fig. 10.9 for fMRI, under the medium constraint).
These are the conditions where the GA solutions were not trapped in a local optimum,
resulting in similar performance patterns for the two algorithms. In those
conditions, CGA brings no additional benefit in retaining suitable solutions. In the
remaining cases, though, GA is stuck in a local optimum (as it often is),
which prevents it from producing better results. In other words, CGA
takes advantage of the characteristics of the chaotic variable to distribute the
individuals of subgenerations ergodically over the defined space and thus to avoid
premature convergence [30]. It also takes advantage of the convergence
characteristic of GA to overcome the randomness of the chaotic process and hence to
increase the probability of finding the global optimal solution.



10.6 Network-on-Chip (NoC): Case Study # 2 



System-on-Chip (SoC) is a chip design method where all of the components of an 
electronic system are integrated into a single chip. Benefits of this integration 
compared with traditional multi-chip design include a size and energy reduction. 
An important concept in chip design is the core. A core is basically a separate and 
reusable unit of logic. Examples of cores include processors, memory banks and 
external communication components. These cores may be licensed from a number
of vendors, under the common label Intellectual Property cores (IP-cores). A
System on Chip can include many IP-cores that need to communicate with each other.
This is traditionally done by shared buses and ad-hoc core-to-core links. Such
traditional communication structures function well, without creating
communication bottlenecks, when a system has few cores [46]. But as the number of
IP-cores increases, the number of potential connections between them grows
quadratically, to a point where assigning the same bus to many IP-cores is not a
practical option due to latency issues.

Over time, traditional SoC communication methods have gradually become
inefficient and complex, and they do not scale well for large SoCs (say, more than 20
IP-cores) [46]. The increasing complexity of such systems leads to some difficulties
in creating a proper communications infrastructure for the chip. When time- 
division buses and custom point to point communications are no longer sufficient, 
more elaborate networks are the obvious choice. By going beyond current buses 
and custom communication designs for the higher levels of interconnection on the 
chip, it might indeed be possible to reach higher performance with lower design 
and verification costs. A scalable communication architecture that avoids these 
problems is required and this is where creating a global network on the chip be- 
comes a viable option. 

10.6.1 Network on Chip 

Network on Chip (NoC) is an emerging communication method for a System on
Chip [47]. NoC attempts to solve the communication problems mentioned in the
previous section by creating an on-chip network consisting of network adapters,
routers/switches and links between them. Each IP-core is connected to a network
adapter which converts the transaction data from the IP-core into the flow control
digits (flits) transmitted across the network. Fig. 10.10 illustrates the basic
concepts of a NoC.

Fig. 10.10 An example of the Network-on-Chip architecture with 'S' for Switches, 'M' for
Memory, 'Re' for Reconfigurable logic, 'rni' for resource-network interface and 'L' for
dedicated hardware.

The individual IP-cores do not need to be aware of how the data is transmitted
on the network. This decoupling of processing from communication is an
important benefit of NoC. It means that IP-cores with different transaction
standards can easily communicate with each other, simplifying the design process.
Other benefits include shorter, simplified wiring and lower energy usage. There
are also some potential drawbacks, including increased delay/latency, especially if
the network is congested, and the extra space used on the chip for the routers and
network adapters. However, the overhead is estimated to be fairly small [48], and
space is usually not the bottleneck in chip design, especially as microchip
technologies continue to shrink.

The future for NoC looks promising, but many problems need to be addressed 
in order for it to find more widespread application. One of the problems is how to 
connect systems of IP-cores that vary in size and/or communication requirements 
(a heterogeneous SoC) defined as the network layout or topology selection. Three 
main factors that have to be taken into account when evaluating a NoC design are 
latency, energy usage and size (area overhead). Another important matter is appli- 
cation mapping which deals with finding the best node arrangement with the aim 
of improving the quality of service parameters. The mapping process can be de- 
scribed as follows: first select a set of IP-cores to distribute the data processing on 
and secondly construct a topology that connects the IP-cores and minimizes com- 
munication costs. The selected set of IP-cores and the data transmission between 
them constitutes what is defined as the Core Graph. 

10.6.2 Problem Description 

The investigation of different network topologies pointed to a two-dimensional 
mesh as the most suitable topology for most on-chip networks. This is also the 
common topology proposed by most researchers [48], [49], [50]. The main rea- 
sons for selecting the two-dimensional mesh instead of other topologies such as 
hypercubes, butterflies, or trees are that a two-dimensional mesh has an acceptable 
wire cost, reasonably high bandwidth, and a natural mapping onto a chip. Routing
refers either to the problem of connecting a topology or to choosing data transmission
routes through a constructed topology. The transmission routes can be either static
or dynamic. Dynamic routing is definitely more flexible [46] but requires more 
complex routers and larger buffers in the network. The static routing scheme cho- 
sen in the implementation of the algorithm is a shortest path routing algorithm. 
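One common static shortest-path scheme on a two-dimensional mesh is dimension-ordered (XY) routing, sketched below. Whether the implementation discussed here uses XY routing specifically is not stated in the text, so this is an illustrative assumption; tiles are identified by hypothetical (x, y) coordinates.

```python
def xy_route(src, dst):
    """Dimension-ordered route on a 2D mesh: move along x first, then along y.

    src and dst are (x, y) tile coordinates; the returned path lists every
    tile visited, including both endpoints.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path
```

The number of hops on such a route equals the Manhattan distance between the two tiles, and the route is fixed for a given source and destination, which is what makes the scheme static.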

The input of our problem is a directed task graph $G(V, E)$, in which every
$v_i \in V$ denotes a processing element or a memory unit (generally an IP core), and
a directed edge $e_k = (v_i, v_j)$ denotes a communication trace from the source node
$v_i$ to the destination node $v_j$. The weight of an edge, $w(e_k)$, usually refers
to the communication cost between the two corresponding nodes. A mesh based topology
of NoC is defined by $(U, L)$, where each vertex $u_i \in U$ denotes a node in the
topology and each $l_k \in L$ denotes a physical link between two vertices. The
weight of a link, $w(l_k)$, represents the bandwidth available across the link $l_k$.
Fig. 10.11 shows the mapping process of a sample task graph onto a tile-based mesh
structure.

Fig. 10.11 The mapping process

In order to optimize the results of the mapping process, various authors have
tried to enhance the results with respect to different performance measures. Lei and
Kumar proposed a two-step GA for mapping task graphs to the NoC architecture
in [51], with the objective of minimizing the average communication delay of the
network. Following the same objective, Murali and De Micheli proposed NMAP in [52],
a fast algorithm that maps the cores onto a mesh NoC architecture under
bandwidth constraints. BMAP is a binomial mapping and optimization algorithm
that reduces the hardware cost of the on-chip network infrastructure [53].

In the Chaos-Genetic Mapping (CGMAP) approach [54], determining the solution
representation is the first priority. The values of the genes in this problem can only
be integer values between 1 and n (the value of n is proportional to the number of
tiles in the mesh). The length of each chromosome depends on the number of
nodes in the communication task graph. Population size is another important
parameter. In an actual application, it would be common to have somewhere between
a few dozen and a few hundred individuals. For the purposes of this problem, we
assume that the first population consists of 100 individuals. The first population
is initialized randomly by means of a random number generator
which assigns values between 1 and n to each of the n positions in every one of
the 100 individuals. Then the chaotic mapping operator is applied to each individual in
the initial population, creating the chaotic population. At this stage, the fitness
of all 200 individuals is evaluated. The fitness value is often proportional to the
output value of the function being optimized. Since data always take the shortest
path in the network, and often more than one such path exists for data going
from node $v_i = (x_i, y_i)$ to $v_j = (x_j, y_j)$, we estimated the hop distance as

$$M(e_k) = |x_i - x_j| + |y_i - y_j| \qquad (10.14)$$

and defined the fitness function as follows

$$F = \sum_{\forall e_k} w(e_k)\, M(e_k) \qquad (10.15)$$

Elitism is incorporated into the algorithm by transferring the single fittest
individual directly to the next generation. Crossover and mutation are also performed on
randomly selected individuals. The initial population is replaced by these newly
generated individuals. More generations are produced until the stopping
condition (a maximum number of generations) is met. The fittest chromosome is
then returned as the solution.



10.6.3 Simulation Results 

The results of running CGMAP on two benchmark applications are presented
in this section: a Video Object Plane Decoder (VOPD) with 16 IP-cores
and 20 links, and an MPEG-4 decoder with 12 nodes and 13 links. Fig. 10.12 shows an
instance of a VOPD task graph mapped onto a two-dimensional mesh using
CGMAP. The results are then compared with those of previous mapping
algorithms such as NMAP [52], BMAP [53], PBB [55], etc., using the same routing
and scheduling characteristics.

Fig. 10.12 VOPD task graph and the place of each associated IP core in a 2-dimensional
mesh



Fig. 10.13 shows the results of CGMAP compared with five other mapping
algorithms in both applications, in terms of communication cost. As is
clear from the figure, CGMAP performs well in both applications. Table 10.2 shows
a comparison between the hop counts of the three most efficient mapping
algorithms for the two benchmark applications. The hop count is a measure of distance
across an IP-based network which keeps track of the number of intermediate
devices (like routers) an IP packet has to pass through in order to reach its
destination. Generally speaking, the more hops data must traverse to reach their
destination, the greater the transmission delay incurred. Taking the average hop count
of NMAP as 1, the table shows that CGMAP decreases the hop count to
an average of 0.97 in the first application (MPEG-4) and to 0.99 in the case of the
second application (VOPD).

Fig. 10.13 Comparison between the communication costs of six mapping algorithms in NoC
Table 10.2 Comparison between the average hop count of mapping algorithms in NoC

Algorithms: NMAP, BMAP, CGMAP, GMAP, PMAP, PBB
Applications: VOPD, MPEG-4
Values as listed: 1.71, 0.99, 0.95, 1.33, 2.13, 1.69, 1.67, 1.02



10.6.4 Performance Analysis of One-Dimensional Chaotic Maps 

In this section, the efficiency of the discussed chaotic maps is compared when
used as an operator in GA [56]. Fig. 10.14 shows a convergence rate comparison
between five one-dimensional chaotic maps. It is noticeable that the ICMIC and Tent
maps have the greatest convergence rates, and the lowest convergence rates belong
to the Bernoulli Shift and the Logistic map. This means that, in order to get the
best results for this specific application, CGMAP should be implemented with the
ICMIC map as the chaotic operator. This way the algorithm reaches an optimum
solution within the shortest time period.

[Figure 10.14: two panels over the maps Logistic, Tent, Sine, Bshift and ICMIC]

Fig. 10.14 Convergence rate and communication cost comparison between chaotic maps.
The algorithm is run 10 times with each chaotic operator and the average results are
depicted. The benchmark application used in this section is VOPD.

Communication costs of executing CGMAP with each of the discussed chaotic
maps are also compared, and the average results are shown in Fig. 10.14.
The Sine map achieves the lowest communication cost of all, while the Bernoulli
Shift incurs a high cost to complete the application. The main aim of this experiment
was to show that the choice of the most effective chaotic map depends on the
benchmark problem on the one hand and on the main objective of the problem on the
other.
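For readers who want to experiment with swapping the chaotic operator, the five maps compared here can be written as one-line update rules. The forms and parameter values below are the commonly used textbook ones and are assumptions; they may differ from the definitions given earlier in the chapter.

```python
import math

def logistic(x):
    return 4.0 * x * (1.0 - x)            # logistic map with r = 4

def tent(x):
    return 2.0 * x if x < 0.5 else 2.0 * (1.0 - x)

def sine(x):
    return math.sin(math.pi * x)          # sine map with unit amplitude

def bernoulli_shift(x):
    return (2.0 * x) % 1.0

def icmic(x, a=2.0):
    return math.sin(a / x)                # requires x != 0; values lie in [-1, 1]

def orbit(step, x0, n):
    """Iterate a map n times from x0 and return all n + 1 values."""
    xs = [x0]
    for _ in range(n):
        xs.append(step(xs[-1]))
    return xs
```

A call such as `orbit(logistic, 0.3, 100)` produces a sequence of chaos variables in [0, 1]. One practical caveat: in binary floating point the tent and Bernoulli shift maps with slope 2 lose one bit of the state per iteration, so long orbits tend to collapse to 0 unless the state is perturbed or a slightly different slope is used.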



10.7 Concluding Remarks 

The chaos optimization algorithm uses chaos variables to search, and the search
proceeds according to the regularity characteristics of the chaotic variables. The
traversal property of chaos variables ensures that a true optimum solution can be
found if the search is allowed to run for sufficient time. Even if the optimization
calculation time is limited, we can get an approximate solution with extremely good
precision.

Grid Scheduling and Network-on-Chip mapping problems both belong to the 
group of NP-complete problems, which are traditionally solved using metaheuris- 
tic algorithms such as GA. In this chapter a Chaos-Genetic Algorithm (CGA) was 
used in order to take advantage of the properties of the chaotic variables to make 
the search of optimal values in GA more effective and faster. This is done by de- 
signing a chaotic mapping operator, using one-dimensional chaotic maps and ap- 
plying it to the GA along with the common genetic operators, namely crossover 
and mutation. Experimental results were highly dependent upon the chaotic map
that was used. Therefore, by prioritizing the objectives one seeks, the chaotic
equation that best matches those priorities may be selected.